Backward compatibility of whole genome sequencing data with MLVA typing using a new MLVAtype shiny application for Vibrio cholerae

Authors: Jérôme Ambroise (1); Léonid M. Irenge (1); Jean-François Durant (1); Bertrand Bearzatto (1); Godfrey Bwire (2); O. Colin Stine (3); Jean-Luc Gala (1)
Author affiliations: (1) Center for Applied Molecular Technologies, Institute of Clinical and Experimental Research, Université catholique de Louvain, Brussels, Belgium; (2) Ministry of Health Uganda, Department of Community Health, Kampala, Uganda; (3) University of Maryland School of Medicine, Department of Epidemiology and Public Health, Baltimore, Maryland, United States of America
Published in: PLoS ONE 14(12)
Category: Research Article



Multiple-Locus Variable Number of Tandem Repeats (VNTR) Analysis (MLVA) is widely used by laboratory-based surveillance networks for subtyping pathogens that cause foodborne and waterborne disease outbreaks. However, Whole Genome Sequencing (WGS) has recently emerged as a new, more powerful reference for pathogen subtyping. This creates a need for a data conversion method that allows users to compare MLVA profiles obtained by either method. The MLVAType shiny application was designed to extract MLVA profiles of Vibrio cholerae isolates from WGS data while ensuring backward compatibility with traditional MLVA typing methods.


To test and validate the MLVAType algorithm, WGS-derived MLVA profiles of nineteen Vibrio cholerae isolates from the Democratic Republic of the Congo (n = 9) and Uganda (n = 10) were compared with MLVA profiles generated by an in silico PCR approach and by Sanger sequencing, the latter being used as the reference method.
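The comparison against a reference method can be illustrated with a minimal Python sketch. It assumes that an MLVA profile is represented as a mapping from locus name to repeat count. The locus names, the profiles, and the function names below are hypothetical and are not data or code from the study.

```python
from typing import Dict

# An MLVA profile: locus name -> number of tandem repeats at that locus.
Profile = Dict[str, int]


def locus_concordance(test: Profile, reference: Profile) -> Dict[str, bool]:
    """For each reference locus, report whether the test profile agrees."""
    return {
        locus: test.get(locus) == count
        for locus, count in reference.items()
    }


def is_concordant(test: Profile, reference: Profile) -> bool:
    """A profile is concordant only if every reference locus matches."""
    return all(locus_concordance(test, reference).values())


# Hypothetical toy profiles, for illustration only.
reference_profile = {"locus_A": 9, "locus_B": 4, "locus_C": 6}
matching_profile = {"locus_A": 9, "locus_B": 4, "locus_C": 6}
differing_profile = {"locus_A": 9, "locus_B": 4, "locus_C": 5}

print(is_concordant(matching_profile, reference_profile))
print(locus_concordance(differing_profile, reference_profile))
```

Reporting agreement per locus, rather than only per profile, makes it easier to see which VNTR locus is responsible when two methods disagree.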


Results obtained by Sanger sequencing and MLVAType were fully concordant. However, the MLVAType results were affected by censored estimations, whose percentage was inversely proportional to the k-mer size used during genome assembly. With a k-mer size of 127, fewer than 15% of the V. cholerae VNTR estimates were censored. Censored estimations could be avoided only by using a longer k-mer size (i.e. 175), which is not available in the SPAdes v.3.13.0 software.
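The abstract does not define a censored estimation. One plausible reading, stated here as an assumption, is that the assembled contig ends inside the repeat array, so the number of counted copies is only a lower bound. This would fit the dependence on k-mer size, since a de Bruijn graph assembler generally cannot resolve a repeat array that is longer than its k-mer. The Python sketch below illustrates that reading. The function name, flanking sequences, and motif are hypothetical, only the forward strand is searched, and this is not the MLVAType implementation.

```python
from typing import NamedTuple, Optional


class RepeatEstimate(NamedTuple):
    count: int      # number of motif copies observed in the contig
    censored: bool  # True if count is only a lower bound


def count_repeats(contig: str, left_flank: str, right_flank: str,
                  motif: str) -> Optional[RepeatEstimate]:
    """Count tandem copies of `motif` that follow `left_flank` in `contig`.

    Returns None when the locus cannot be typed in this contig.
    """
    start = contig.find(left_flank)
    if start == -1:
        return None  # left flank absent: locus not found on this contig

    pos = start + len(left_flank)
    count = 0
    # Walk along the array one motif copy at a time.
    while contig.startswith(motif, pos):
        count += 1
        pos += len(motif)

    if contig.startswith(right_flank, pos):
        return RepeatEstimate(count, censored=False)  # array fully spanned

    # The contig stops before the right flank could be observed, so the
    # true number of copies may be larger than what was counted.
    if len(contig) - pos < len(right_flank):
        return RepeatEstimate(count, censored=True)

    return None  # unexpected sequence after the array


# Hypothetical sequences, for illustration only.
LEFT, RIGHT, MOTIF = "GGCTT", "TTCGG", "AACAGA"
complete_contig = "ACGT" + LEFT + MOTIF * 3 + RIGHT + "AC"
truncated_contig = "ACGT" + LEFT + MOTIF * 3 + "AAC"

print(count_repeats(complete_contig, LEFT, RIGHT, MOTIF))
print(count_repeats(truncated_contig, LEFT, RIGHT, MOTIF))
```

Under this reading, a censored result would be reported as "at least n repeats" rather than as an exact count, which keeps it comparable with, but not a substitute for, a Sanger-derived value.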


As NGS read lengths and qualities tend to increase over time, larger k-mer sizes can be expected to become usable in the near future. Using the MLVAType application with a longer k-mer size will then efficiently retrieve MLVA profiles from WGS data while avoiding censored estimations.

Keywords:

Dideoxy DNA sequencing – Genomic libraries – Polymerase chain reaction – Sequence assembly tools – Sequence motif analysis – Tandem repeats – Uganda – Vibrio cholerae



Article published in: PLoS ONE, 2019, Issue 12