De novo identification of satellite DNAs in the sequenced genomes of Drosophila virilis and D. americana using the RepeatExplorer and TAREAN pipelines

Authors: Bráulio S. M. L. Silva; Pedro Heringer; Guilherme B. Dias; Marta Svartman; Gustavo C. S. Kuhn
Author affiliation: Departamento de Genética, Ecologia e Evolução, Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais, Brasil
Published in: PLoS ONE 14(12)
Category: Research Article
doi: 10.1371/journal.pone.0223466


Satellite DNAs (satDNAs) are among the most abundant repetitive DNAs in eukaryotic genomes, where they play a variety of biological roles, from forming components of important chromosome structures to participating in gene regulation. Experimental methodologies used before the genomic era were insufficient, being too laborious and time-consuming to recover the full collection of satDNAs from a genome. Today, the availability of whole sequenced genomes, combined with the development of specific bioinformatic tools, is expected to foster the identification of virtually the entire “satellitome” of a particular species. While whole-genome assemblies are important for obtaining a global view of genome organization, most of them are incomplete and lack repetitive regions. We applied short-read sequencing and similarity clustering to perform a de novo identification of the most abundant satellite families in two Drosophila species from the virilis group, Drosophila virilis and D. americana, using the Tandem Repeat Analyzer (TAREAN) and RepeatExplorer pipelines. These species were chosen because they have been used as models for understanding satDNA biology since the early 1970s. We combined the computational approach with data from the literature and chromosome mapping to obtain an overview of the major tandem repeat sequences of these species. Because all of the abundant tandem repeats (TRs) we detected had previously been identified in the literature, we were able to evaluate how efficiently TAREAN identifies true satDNAs. Our results indicate that raw sequencing reads can be used efficiently to detect satDNAs, but that abundant tandem repeats present in dispersed arrays or associated with transposable elements are frequent false positives. We demonstrate that TAREAN, together with its parent method RepeatExplorer, may be used as a resource to detect tandem repeats associated with transposable elements and also to reveal families of dispersed tandem repeats.
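The idea of similarity clustering of short reads can be illustrated with a toy sketch. This is not the RepeatExplorer or TAREAN implementation, which is considerably more elaborate; it only shows why reads sampled from a tandem array tend to fall into one large cluster while reads from unique sequence stay apart. The function names, the 12 bp monomer, the k-mer size, and the sharing threshold are all hypothetical choices made for this example.

```python
from itertools import combinations


def kmers(seq, k):
    """Return the set of all k-mers (substrings of length k) in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def cluster_reads(reads, k=8, min_shared=3):
    """Group reads that share at least `min_shared` k-mers (toy criterion).

    Reads are linked pairwise and merged with a union-find structure.
    Returns lists of read indices, largest cluster first.
    """
    parent = list(range(len(reads)))

    def find(i):
        # Follow parent links to the cluster root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    kmer_sets = [kmers(r, k) for r in reads]
    for i, j in combinations(range(len(reads)), 2):
        if len(kmer_sets[i] & kmer_sets[j]) >= min_shared:
            parent[find(i)] = find(j)

    clusters = {}
    for i in range(len(reads)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values(), key=len, reverse=True)


# Hypothetical data: four reads sampled at different offsets from a
# tandem array of a made-up 12 bp monomer, plus two unrelated reads.
monomer = "ACGTTGCAAGCT"
array = monomer * 6
reads = [array[o:o + 30] for o in (0, 5, 11, 17)]
reads += ["TTTTTTTTTTGGGGGGGGGGCCCCCCCCCC",
          "GATCGATCGATTACAGGCATCAGGACTTAC"]

print(cluster_reads(reads))  # [[0, 1, 2, 3], [4], [5]]
```

In this sketch the four tandem-array reads collapse into a single cluster because every 30 bp window of the array contains all of the monomer's cyclic 8-mers, whereas the two unrelated reads remain singletons. Cluster size relative to the total number of reads then serves as a rough proxy for repeat abundance.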

Keywords:

Clustering algorithms – Drosophila – Drosophila melanogaster – Genome analysis – Invertebrate genomics – Tandem repeats – Transposable elements – Polytene chromosomes


This article was published in PLoS ONE, 2019, issue 12.