ADAPTS: Automated deconvolution augmentation of profiles for tissue specific cells

Authors: Samuel A. Danziger [1]; David L. Gibbs [2]; Ilya Shmulevich [2]; Mark McConnell [1]; Matthew W. B. Trotter [1]; Frank Schmitz [1]; David J. Reiss [1]; Alexander V. Ratushny [1]
Author affiliations: [1] Celgene Corporation, Seattle, Washington, United States of America; [2] Institute for Systems Biology, Seattle, Washington, United States of America; [3] Celgene Institute for Translational Research Europe, Seville, Sevilla, Spain
Published in: PLoS ONE 14(11)
Category: Research Article
doi: 10.1371/journal.pone.0224693


Immune cell infiltration of tumors and of the tumor microenvironment can be an important factor in determining patient outcomes. For example, immune and stromal cell presence inferred by deconvolving patient gene expression data may help identify high-risk patients or suggest a course of treatment. One particularly powerful family of deconvolution techniques uses signature matrices of genes that uniquely identify each cell type, as determined from gene expression data of purified single cell types. Many methods from this family have been published recently, often with new signature matrices tailored to a single purpose, such as investigating a specific type of tumor. The ADAPTS package helps users make the most of this expanding knowledge base by introducing a framework for cell type deconvolution. ADAPTS implements modular tools for customizing signature matrices for new tissue types, either by adding custom cell types or by building new matrices de novo, including from single-cell RNA-seq data. It includes a common interface to several popular deconvolution algorithms that use a signature matrix to estimate the proportions of cell types present in heterogeneous samples. ADAPTS also implements a novel method for clustering cell types into groups that are difficult to distinguish by deconvolution and then re-splitting those clusters using hierarchical deconvolution. We demonstrate that the techniques implemented in ADAPTS improve the ability to reconstruct the cell types present in a single-cell RNA-seq data set in a blind predictive analysis. ADAPTS is currently available for use in R on CRAN and GitHub.
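
To make the idea of signature-matrix deconvolution concrete, the following is a minimal sketch in plain Python. It is not the ADAPTS API (ADAPTS is an R package) and not any of the published algorithms. It assumes that a bulk sample is a non-negative linear mixture of the columns of a signature matrix and estimates the cell type fractions by non-negative least squares solved with projected gradient descent. The function name `deconvolve`, the toy signature matrix, and the chosen fractions are all hypothetical and serve only as illustration.

```python
def deconvolve(signature, mixture, iterations=2000):
    """Estimate cell type fractions for one bulk sample.

    signature: list of rows, one per gene; each row holds the expression
               of that gene in every cell type (genes x cell types).
    mixture:   bulk expression of the same genes, in the same order.

    Minimises 0.5 * ||S x - m||^2 subject to x >= 0 by projected gradient
    descent, then rescales x so that the fractions sum to 1.
    """
    n_genes = len(signature)
    n_types = len(signature[0])
    # The squared Frobenius norm of S bounds the largest eigenvalue of
    # S^T S from above, so its reciprocal is a safe (conservative) step size.
    step = 1.0 / sum(v * v for row in signature for v in row)
    x = [0.0] * n_types
    for _ in range(iterations):
        # Residual S x - m, one entry per gene.
        residual = [
            sum(signature[g][k] * x[k] for k in range(n_types)) - mixture[g]
            for g in range(n_genes)
        ]
        # Gradient S^T (S x - m), one entry per cell type.
        grad = [
            sum(signature[g][k] * residual[g] for g in range(n_genes))
            for k in range(n_types)
        ]
        # Gradient step followed by projection onto x >= 0.
        x = [max(0.0, x[k] - step * grad[k]) for k in range(n_types)]
    total = sum(x)
    return [v / total for v in x] if total > 0 else x


# Hypothetical signature matrix: 5 marker genes (rows) x 3 cell types (columns).
signature = [
    [10.0, 1.0, 0.0],
    [0.0, 8.0, 1.0],
    [1.0, 0.0, 9.0],
    [5.0, 5.0, 0.0],
    [0.0, 2.0, 6.0],
]

# Simulate a noise-free bulk sample from known (made-up) fractions.
true_fractions = [0.5, 0.3, 0.2]
mixture = [
    sum(row[k] * true_fractions[k] for k in range(len(true_fractions)))
    for row in signature
]

estimated = deconvolve(signature, mixture)
print([round(v, 4) for v in estimated])
```

On real data the mixture is noisy and the signature matrix is much larger, so the recovered fractions are only approximate, and cell types with near-collinear signature columns become hard to separate. That difficulty is what motivates grouping such cell types and re-splitting them hierarchically, as described above.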

Keywords:

Adipocytes – Algorithms – Cell differentiation – Gene expression – Immune cells – Myelomas – Plasma cells



This article appeared in issue 11, 2019.