Introduction

The small RNA Expression Atlas (SEA) is a repository of small RNA (sRNA) expression datasets. These expression datasets are systematically annotated with metadata. For instance, biological metadata includes standardized information about the organism, cell line, cell type, tissue type, potential diseases and more. Additional annotations include experimental details about instrument models and library strategies. All data were analysed with the Oasis pipelines so that small RNA expression can be compared across many studies.

SEA can be searched for sRNAs that originate from miRBase, Ensembl as well as from the repository of novel predicted miRNAs from Oasis. A search can be performed with individual search terms as well as combinations thereof.

The central feature of SEA is the powerful ontology-based search which allows the user to easily find datasets that are relevant to their research.

(Please note: The screenshots you see in this manual have been edited in order to save screen space. When you use SEA you will likely see slightly different visual results)

SEArching with SEA

All searches for datasets or sRNAs start by using the search bar on the start page. When you start typing you should see search suggestions popping up. Figure 1 shows what it looks like when you start to enter "skin" into the search bar.

SEA search suggestions for input "sk"
Figure 1: Search suggestions for the input "sk". The user can either select specific small RNA molecules or search terms for diseases and tissues

As you can see, the search suggestions come from different categories. Overall SEA supports the following search categories:

Please note that SEA only allows searches based on suggested terms. All terms that are suggested to you (while typing) are guaranteed to be in the database. If you are looking for a specific disease/tissue/organism/etc. and no matching terms are suggested, then there is no dataset with your criteria in the database.

If you now select skin from the suggestions list, you will see that tissue:skin has been added to the search bar (Figure 2).

Search bar with inserted search term tissue:skin
Figure 2: The search term tissue:skin has been added into the search bar. You can now either press enter (while focussing the search bar) to start a search for datasets that contain skin samples or you can add additional search terms.

Combining several search terms

When using several search terms, a dataset will be found if the following rules are satisfied:

search expression example
Figure 3: A complex search expression using search terms from different categories: Two different tissues, one disease and one organism. If you want to try it out for yourself, you can add the search terms to the search bar.

For example, let us assume we search for samples from skin or muscle tissue in human Psoriasis patients (see Figure 3). Once we hit the enter key in the search bar, we are taken to the search results page (Figure 4). At the top you will see your search query again. You can see that SEA searched for skin or muscle tissue. Given that Psoriasis is a disease of the skin, it is not surprising that we do not find any datasets that contain muscle tissue.

Search results
Figure 4: Results of the search query. Note that the search query is also displayed above the search results. Live results.
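The combination rules themselves are not reproduced in this text, so the following Python sketch only illustrates one plausible reading that is consistent with the example above: terms from the same category are treated as alternatives (skin or muscle), while terms from different categories must all match. The rule itself, the category prefixes other than tissue:, the function names and the dataset records are assumptions made for illustration and do not describe SEA's actual implementation.

```python
from collections import defaultdict


def group_terms(terms):
    """Group 'category:value' search terms by category."""
    grouped = defaultdict(set)
    for term in terms:
        category, _, value = term.partition(":")
        grouped[category.strip().lower()].add(value.strip().lower())
    return dict(grouped)


def matches(dataset, grouped):
    """Assumed rule: OR within a category, AND across categories."""
    return all(
        dataset.get(category, "").lower() in values
        for category, values in grouped.items()
    )


# Hypothetical dataset annotations, for illustration only.
datasets = [
    {"id": "example-1", "tissue": "skin", "disease": "psoriasis", "organism": "human"},
    {"id": "example-2", "tissue": "muscle", "disease": "none", "organism": "human"},
    {"id": "example-3", "tissue": "skin", "disease": "psoriasis", "organism": "mouse"},
]

query = ["tissue:skin", "tissue:muscle", "disease:Psoriasis", "organism:human"]
grouped = group_terms(query)
hits = [d["id"] for d in datasets if matches(d, grouped)]
print(hits)
```

Under this assumed rule, the muscle dataset is excluded because it does not carry the disease annotation, which mirrors the outcome described for Figure 4.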

Searching with ontologies

Each dataset in the SEA database is annotated with terms that come from ontologies. In simple words, an ontology is a list of relationships between words. For example, if we take the words human and mammal, we can say that a human is a mammal. And not only humans are mammals, but mice, dogs, dolphins and pigs are mammals too. But it does not end there. All mammals are also vertebrates. And all vertebrates are chordates.

Ontologies are not only restricted to organisms. Many more ontologies have been defined by independent organisations. When you use the search in SEA, all datasets will be found that match the search term but also all subterms as they are defined in the ontologies. For example, if you search for neurodegenerative disease, you will get search results from Alzheimer's and Huntington's disease. If you search for murinae you will get datasets from mice as well as from rats. This way you can be as broad or as specific with your search as you wish.
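To make the idea of subterm matching more concrete, here is a minimal Python sketch. It uses a tiny, heavily simplified hierarchy built from the examples above; the data structure, the function names and the dataset annotations are hypothetical and are not taken from SEA or from any real ontology file.

```python
# Toy is-a hierarchy: parent term -> subterms (simplified, illustrative only).
SUBTERMS = {
    "mammal": ["murinae", "human"],
    "murinae": ["mouse", "rat"],
    "neurodegenerative disease": ["alzheimer's disease", "huntington's disease"],
}


def expand(term):
    """Return the term plus all of its subterms, at any depth."""
    found = []
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        found.append(current)
        stack.extend(SUBTERMS.get(current, []))
    return found


def search(term, datasets):
    """Find datasets annotated with the term or with any of its subterms."""
    accepted = set(expand(term))
    return [name for name, annotation in datasets.items() if annotation in accepted]


# Hypothetical datasets, each annotated with a single organism term.
datasets = {"dataset A": "mouse", "dataset B": "rat", "dataset C": "human"}
print(search("murinae", datasets))
print(search("mammal", datasets))
```

A search for the broad term collects every dataset annotated with a more specific term beneath it, while a search for a specific term such as mouse would return only datasets annotated with exactly that term.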

Working with the SEArch results

When you work with SEA, you will most likely be in one of the following situations:
  1. You are interested in a specific sRNA and you want to obtain expression datasets that contain this sRNA.
  2. You are interested in a specific tissue/disease/cell-line and you want to obtain sRNAs that are relevant in this tissue/disease/cell-line.
  3. You know which sRNA and which tissue/disease/cell-line you want to do research on. Now you simply want datasets that feature this sRNA in this particular tissue/disease/cell-line.
(Your starting point might also involve more than one sRNA and/or tissue. Feel free to add more search terms in this case.)

Keep in mind that SEA keeps track of your search/filtering criteria as you go through the results. If you select a specific sRNA, a specific dataset and/or a specific tissue, all subsequent diagrams and tables will be based only on datasets that fulfill your criteria.

The following examples will use the microRNA hsa-miR-3179 and the tissue skin. Please choose your starting point:

  1. Search only by small RNA ID
  2. Search only by dataset criteria (tissue, disease, cell-type, ...)
  3. Search by small RNA ID and additional dataset criteria

Small RNA overview

You have searched for a specific sRNA molecule. The search results should look like Figure 5:

small RNA details
Figure 5: Details on a small RNA molecule.
Top: Basic genomic information on the small RNA molecule. Further information can be obtained from mirbase.org when clicking on the link in the Details row.
Middle: Search query (as a reminder of your search input)
Bottom: Median expression levels of the sRNA molecule. The expression levels are shown as the log2 of the median RPM (reads per million). You can switch between different expression views by clicking on "cell lines", "cell types", "tissues" or "diseases". If you have searched for a specific sRNA without any further criteria, you may also select No Expression Datasets to get a list of datasets where the current sRNA is not expressed at all. Each row corresponds to one dataset. Keep in mind that each dataset usually contains several samples with several measurements for the same sRNA molecule.
You can also export the diagram in either HTML or comma-separated format by clicking on Chart options.
If you like you can also explore a live version of the above figure.
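As a rough illustration of the "log2 of the median RPM" value shown in the expression chart, the following Python sketch computes it for one sRNA in one dataset. The read counts are made up, and the pseudocount added before taking the logarithm is an assumption of this sketch (it keeps the value defined when the median is zero); how SEA and Oasis handle these details is not stated here.

```python
import math
from statistics import median


def rpm(count, total_reads):
    """Reads per million: scale a raw count by the sample's total read count."""
    return count * 1_000_000 / total_reads


def log2_median_rpm(samples, pseudocount=1.0):
    """log2 of the median RPM across the samples of one dataset.

    The pseudocount is an assumption of this sketch, not a documented SEA setting.
    """
    values = [rpm(count, total) for count, total in samples]
    return math.log2(median(values) + pseudocount)


# Hypothetical (sRNA read count, total read count) pairs for three samples.
samples = [(30, 2_000_000), (70, 10_000_000), (45, 3_000_000)]
print(log2_median_rpm(samples))
```

Taking the median across samples gives one value per dataset, which matches the chart layout in which each row corresponds to one dataset.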

Dataset search results

You have searched with specific dataset criteria. Your search results should look like Figure 6:

Datasets matching the search query
Figure 6: Datasets matching your search criteria. Above the table you see your search query again. The table has several columns that describe different aspects of each dataset. In addition, there are links in the GSE ID, Expressed sRNAs and Analysis output columns. The GSE ID is the GEO SEries ID of the dataset (consisting of several samples which in turn have GSM IDs). When you click on GSE ID or on Expressed sRNAs you will get to a page with a detailed small RNA listing for the dataset. If you click on Analysis output you will get to a page that describes the quality and distribution of the reads in the dataset (produced by Oasis).
You may also explore a live version of the above figure if you like.

Small RNAs in a dataset

If you selected a specific dataset you should see something similar to Figure 7:

List of small RNAs contained in the dataset
Figure 7: Table with small RNAs in the selected dataset. Each row corresponds to one GEO dataset. The first column is a simple enumeration of the search hits. Please note the buttons at the bottom of the table for displaying further sRNA results. The second column lists the ID of the small RNA. When you click on the link in the third column (GSE ID), you will get to a page that displays an overview of the expression of the selected small RNA molecule within the samples of the current dataset. Similarly, the link in the fourth column (Expression/All Queried Studies) will get you to a page that displays an overview of the expression of the selected small RNA molecule across all datasets that satisfy your search criteria. The last column lists the number of samples in each dataset.
  • A: Name of the selected dataset.
  • B: Link to an overview on the read distribution and read quality values of the dataset.
  • C: A filter form to quickly find a sRNA molecule of interest in the table.

You may also explore a live version of the above figure if you like.

Expression of sRNA molecule within one dataset

If you choose to investigate the expression of a single sRNA molecule in a single dataset you get something similar to Figure 8:

Overview of expression of a sRNA molecule within one dataset (several samples)
Figure 8:
  • A: Name of the dataset
  • B: Link to an overview on the read distribution and read quality values of the dataset.
  • C: Table with basic genomic information on the small RNA molecule.
  • D: Chart Options: Export the table as HTML file or as comma-separated file.
  • E: Table of all the samples in the dataset. N/A values signify information that is not available/applicable for this dataset.
  • F: Expression counts of the selected small RNA molecule in different samples within the chosen dataset. Each row corresponds to one sample within the chosen dataset (GSM=GEO sample).

You may also explore a live version of the above figure if you like.

Appendix

Small RNA identifiers

SEA uses standard small RNA identifiers for the search. Keep in mind that different types of small RNAs follow different identifier conventions. For instance, microRNA IDs usually start with the code of the species they are derived from: a human microRNA ID usually starts with hsa-. The situation is similar for Piwi-interacting RNAs (piRNAs), but their identifiers use an underscore instead of a dash: hsa_. Small nucleolar RNA (snoRNA) IDs tend to start with SNO, and ribosomal RNA (rRNA) IDs usually start with a lowercase r.
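The conventions above can be turned into a simple prefix check. The Python sketch below is only a heuristic built from the "usually" and "tend to" rules in this section: it assumes a three-letter lowercase species code, the function name is hypothetical, and the piRNA and rRNA identifiers in the example are made up to show the prefix pattern. It is not how SEA itself classifies identifiers.

```python
def guess_srna_type(identifier):
    """Guess the sRNA class from the ID prefix conventions described above."""
    prefix = identifier[:3]
    # Assumed pattern: three-letter lowercase species code, then '-' or '_'.
    if len(identifier) > 4 and prefix.isalpha() and prefix.islower():
        if identifier[3] == "-":
            return "miRNA"
        if identifier[3] == "_":
            return "piRNA"
    if identifier.startswith("SNO"):
        return "snoRNA"
    if identifier.startswith("r"):
        return "rRNA"
    return "unknown"


# hsa-miR-3179 is the example used in this manual; the piRNA and rRNA IDs
# below are hypothetical and only illustrate the prefix conventions.
for example_id in ["hsa-miR-3179", "hsa_piR_000001", "SNORD3A", "rRNA_example"]:
    print(example_id, "->", guess_srna_type(example_id))
```

The species-code check runs before the rRNA check so that identifiers whose species code happens to start with r (for example a rat microRNA beginning with rno-) are not mistaken for rRNA IDs.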