Quartet-based inference of cell differentiation trees from ChIP-Seq histone modification data

Authors: Nazifa Ahmed Moumi; Badhan Das; Zarin Tasnim Promi; Nishat Anjum Bristy; Md. Shamsuzzoha Bayzid
Author affiliation: Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology, Dhaka, Bangladesh
Published in: PLoS ONE 14(9)
Category: Research Article
doi: 10.1371/journal.pone.0221270


Understanding cell differentiation, the process by which distinct cell types are generated, plays a pivotal role in developmental and evolutionary biology. Transcriptomic information and epigenetic marks help elucidate the hierarchical developmental relationships among cell types. Standard phylogenetic approaches such as maximum parsimony, maximum likelihood, and neighbor joining have previously been applied to ChIP-Seq histone modification data to infer cell-type trees, which show how diverse types of cells are related. In this study, we demonstrate that quartet-based phylogenetic tree estimation techniques are applicable to, and suitable for, constructing cell-type trees. We propose two quartet-based pipelines for constructing cell phylogenies. We assessed the validity of our methods in inferring the hierarchical differentiation processes of various cell types from H3K4me3, H3K27me3, H3K36me3, and H3K27ac histone mark data. We also propose a robust metric for evaluating cell-type trees.
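To make the idea of a quartet concrete, the following minimal sketch shows one common way a four-taxon topology can be chosen from binary presence/absence profiles. It is an illustration only and not the pipelines proposed in the paper, whose details are not given in this abstract. The sketch assumes that each cell type is encoded as a 0/1 vector over genomic regions, uses Hamming distance, and selects the pairing with the smallest within-pair distance sum (the four-point condition). The cell-type names, the toy profiles, and the function names `hamming`, `quartet_topology`, and `all_quartets` are hypothetical.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length 0/1 profiles differ."""
    return sum(x != y for x, y in zip(a, b))


def quartet_topology(profiles, a, b, c, d):
    """Pick one of the three possible pairings of four taxa.

    Assumption: the pairing with the smallest sum of within-pair
    distances is taken as the quartet topology (four-point condition).
    Ties are broken in favor of the first candidate listed.
    """
    def dist(x, y):
        return hamming(profiles[x], profiles[y])

    candidates = {
        ((a, b), (c, d)): dist(a, b) + dist(c, d),
        ((a, c), (b, d)): dist(a, c) + dist(b, d),
        ((a, d), (b, c)): dist(a, d) + dist(b, c),
    }
    return min(candidates, key=candidates.get)


def all_quartets(profiles):
    """Infer a topology for every four-taxon subset of the cell types."""
    names = sorted(profiles)
    return [quartet_topology(profiles, *q) for q in combinations(names, 4)]


# Made-up toy data: 1 = histone-mark peak present in a region, 0 = absent.
profiles = {
    "blood_1": [1, 1, 1, 0, 0, 0, 1, 0],
    "blood_2": [1, 1, 1, 0, 0, 0, 0, 0],
    "fibro_1": [0, 0, 1, 1, 1, 1, 0, 0],
    "fibro_2": [0, 0, 1, 1, 1, 1, 0, 1],
    "lung_1": [0, 1, 0, 1, 0, 1, 1, 1],
}

if __name__ == "__main__":
    for left, right in all_quartets(profiles):
        print(f"{left[0]},{left[1]} | {right[0]},{right[1]}")
```

A quartet amalgamation method would then combine such quartets into a single tree over all cell types; that step is not shown here.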

Keywords:

Blood – Cell differentiation – Epigenetics – Fibroblasts – Lung development – Phylogenetic analysis – Phylogenetics – Principal component analysis



Article published in: 2019, Issue 9
