All of gene expression (AOE): An integrated index for public gene expression databases


Authors: Hidemasa Bono
Author affiliations: Database Center for Life Science (DBCLS), Joint Support-Center for Data Science Research, Research Organization of Information and Systems, Mishima, Japan
Published in: PLoS ONE 15(1)
Category: Research Article
doi: 10.1371/journal.pone.0227076

Summary

Gene expression data have been archived as microarray and RNA-seq datasets in two public databases, Gene Expression Omnibus (GEO) and ArrayExpress (AE). In 2018, the DNA DataBank of Japan started a similar repository called the Genomic Expression Archive (GEA). These databases are useful resources for the functional interpretation of genes, but they are maintained separately and may lack RNA-seq data whose original sequence data are available in the Sequence Read Archive (SRA). To integrate publicly available gene expression data, we constructed an index of these repositories called All Of gene Expression (AOE). AOE provides a web interface for querying the data graphically, in addition to an application programming interface (API). Because it also collects RNA-seq gene expression data from the SRA, AOE includes data that are not in GEO or AE. AOE is accessible as a search tool from the GEA website and is freely available at https://aoe.dbcls.jp/.
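The idea of a single index over several repositories can be illustrated with a minimal Python sketch. This is not the AOE implementation: the `Record` fields, the `project` linking key, the helper names `build_index` and `only_in`, and all identifiers in the sample data are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    source: str     # repository name: "GEO", "AE", "GEA" or "SRA"
    accession: str  # made-up identifier, not a real accession
    project: str    # hypothetical shared key used to link records
    assay: str      # "microarray" or "RNA-seq"


def build_index(records):
    """Group records from all repositories under their shared project key."""
    index = defaultdict(list)
    for rec in records:
        index[rec.project].append(rec)
    return dict(index)


def only_in(index, source):
    """Return project keys whose records come from `source` alone."""
    return sorted(
        key for key, recs in index.items()
        if {rec.source for rec in recs} == {source}
    )


# Illustrative records only; none of these identifiers exist.
records = [
    Record("GEO", "geo-0001", "P1", "microarray"),
    Record("AE", "ae-0001", "P1", "microarray"),
    Record("GEA", "gea-0001", "P2", "RNA-seq"),
    Record("SRA", "sra-0001", "P2", "RNA-seq"),
    Record("SRA", "sra-0002", "P3", "RNA-seq"),
]

index = build_index(records)
print(only_in(index, "SRA"))
```

In this toy data set, project `P3` appears only in the SRA, which mirrors the kind of RNA-seq entry that an index built from GEO and AE alone would miss.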

Keywords:

Archives – Data visualization – Database searching – Gene expression – Genomic databases – Sequence databases – Transcriptome analysis – Web-based applications



This article was published in the journal

PLOS One


2020, Issue 1