Chromosomal Copy Number Variation, Selection and Uneven Rates of Recombination Reveal Cryptic Genome Diversity Linked to Pathogenicity

Download PDF České info

Pathogenic fungi constitute a growing threat to both plant and animal species on a global scale. Despite a clonal mode of reproduction dominating the population genetic structure of many fungi, putatively asexual species are known to adapt rapidly when confronted by efforts to control their growth and transmission. However, the mechanisms by which adaptive diversity is generated across a clonal background are often poorly understood. We sequenced a global panel of the emergent amphibian pathogen, Batrachochytrium dendrobatidis (Bd), to high depth and characterized rapidly changing features of its genome that we believe hold the key to the worldwide success of this organism. Our analyses show three processes that contribute to the generation of de novo diversity. Firstly, we show that the majority of wild isolates manifest chromosomal copy number variation that changes over short timescales. Secondly, we show that cryptic recombination occurs within all lineages of Bd, leading to large regions of the genome being in linkage equilibrium, and is preferentially associated with classes of genes of known importance for virulence in other pathosystems. Finally, we show that these classes of genes are under directional selection, and that this has predominantly targeted the Global Panzootic Lineage (BdGPL). Our analyses show that Bd manifests an unusually dynamic genome that may have been shaped by its association with the amphibian host. The rates of variation that we document likely explain the high levels of phenotypic variability that have been reported for Bd, and suggests that the dynamic genome of this pathogen has contributed to its success across multiple biomes and host-species.

Published in the journal: . PLoS Genet 9(8): e32767. doi:10.1371/journal.pgen.1003703
Category: Research Article
doi: https://doi.org/10.1371/journal.pgen.1003703

Summary

Introduction

A diverse cadre of fungi and fungal-like oomycetes have recently taken centre stage as emerging infectious diseases (EIDs) owing to their increasing impact on animals, plants and wider ecosystem health [1]. The widespread emergence of this class of pathogens shows that they are able to successfully adapt to infect diverse hosts and ecological niches, suggesting that their genomes are able to respond rapidly to natural selection [1], [2]. This idea finds widespread support; for example, horizontal transfer of whole chromosomes [3] and accelerated evolution across functional domains in effector genes [4] are associated with rapid host-adaptation and changes in virulence across lineages and species. Maintaining the pool of genetic diversity necessary to respond to selection is facilitated by the ability of fungi to utilise multiple reproductive modes, including cryptic recombination that enables inbreeding, outcrossing, hybridization, and the generation of diversity via parasexual mechanisms [5]. These features are suspected to have contributed to the rise of contemporary fungal EIDs, which play a major role in host population declines across a broad swathe of plant and animal species [1], [6], [7].

In recent years, whole genome sequencing has led to the characterization of novel mechanisms driving dynamic genome structure in microbial eukaryotes. In particular, it is increasingly apparent that pathogenic fungi manifest highly plastic genome architecture in the form of variable numbers of individual chromosomes, known as chromosomal copy-number variation (CCNV) or aneuploidy. This feature has been identified across the fungal phylum Ascomycota, ranging from Botrytis cinerea [8], Histoplasma capsulatum [9], Saccharomyces cerevisiae [10], Candida albicans [11] and the Basidiomycota Cryptococcus neoformans [12], [13], [14]. The mechanism(s) generating chromosomal CCNV in fungi are not yet well understood, but are thought to occur as a consequence of nondisjunction following meiotic or mitotic segregation [15], followed by selection operating to stabilise the chromosomal aneuploidies [13]. Although stress occurring as a consequence of either host response or exposure to antifungal drugs has been linked to a rapid rate of CCNV in Candida [16], it is currently unclear to what extent this contributes to broader rates of CCNV in fungi. However, dynamic numbers of chromosomes could offer routes to potentially advantageous phenotypic changes via several mechanisms such as over expression of virulence-factors [13] or drug efflux pumps [17], the maintenance of diversity through homologous recombination [18], increased rates of mutation and larger effective population sizes [19], or by purging deleterious mutations through non-disjunction during chromosomal segregation [20]. Thus, CCNV likely represents an important, yet uncharacterized, source of de novo variation and adaptive potential in many fungi and other non-model eukaryote microbial pathogens.

A contemporary EID that is gaining substantial notoriety is the aquatic chytrid fungus Batrachochytrium dendrobatidis (Bd), which has so far been identified in over 50 countries worldwide and infecting over 500 species of amphibians [21] (http://www.bd-maps.net). One of the most enigmatic aspects of Bd's population genetic structure has been the low levels of genetic variation identified between globally distributed isolates. However, recent studies have shown the existence of up to five separate lineages [22], [23], [24], one of which is shown to have undergone a worldwide range expansion in the 20th Century. We recently compared the genomic diversity of this ‘Global Panzootic Lineage’ (BdGPL) against that of a separate, distantly related (∼1,000 ybp) lineage that appears to have originated in South Africa (named BdCAPE), using SOLiD sequencing. BdGPL was found to harbour evidence of historical recombination, manifested as patchily distributed heterozygosity, and phylogenetic incongruency across small spatial scales that we hypothesised has resulted from ongoing recombination [22]. Therefore, despite the lack of any known sexual meiotic mechanisms in its life cycle, Bd clearly has a more dynamic genome than a purely clonal, mitotic mode of reproduction would suggest. Here, we describe a new global panel of isolates that were subjected to high-depth Illumina sequencing in order to better understand cryptic genomic features that are associated with the rapid ascendancy of this pathogen.

Results

We sequenced 22 isolates of Bd with a geographical distribution spanning five continents to a high depth (52–195X; Table 1) using the Illumina HiSeq 2000 platform. Sequences are deposited in the NCBI Short Read Archive under the submission accession number SRA058657. These reads were then aligned to a reference sequence assembly for isolate JEL423 [25] using BWA [26] and polymorphisms were identified using BiSCaP [27] (Text S1: Optimization of alignments and SNP calling parameters). In total, we identified 904,000 SNPs, 761,000 bi-allelic heterozygous positions and 95,000 multi-allelic heterozygous positions (Table S1), which were distributed across 425,000 loci. Of those sites, 8,457 were homozygous across all of the 42 sequenced isolates (Illumina dataset presented here, SOLiD dataset presented previously [22]), which when combined resulted in concordant phylogenetic trees (Fig. S3). These phylogenies showed that our new panel of isolates belonged to three (BdGPL, BdCAPE, BdCH) of the five suspected lineages of Bd [22], [23], [24] (Figs. S1, S2, S3) and extended both BdGPL and BdCAPE's known geographic range (BdGPL into Switzerland and Ethiopia and BdCAPE into France). Across the Bd JEL423 genome, 96% was covered by at least four reads in every isolate. Additionally, 65% of the total identified variant sites (275,000; 11.8 Kb¹) were called as either reference or polymorphic in all 22 isolates amounting to >10X the number of ‘covered in all’ polymorphic loci previously found using the ABI SOLiD 3 platform [22], owing to the higher depth of sequencing coverage (Table 1).

**Tab. 1. Samples used and details of alignments.**

Chromosomal Copy Number Variation (CCNV)

Comparing the depth of read coverage over each chromosome using 10 Kb non-overlapping sliding windows revealed CCNV present in isolates belonging to all three lineages of Bd and affecting nine of the largest fifteen supercontigs (Figs. 1 and S4). t-tests on the mean depths across windows compared with those in the largest supercontig confirmed a significant increase in read-depth across 36 supercontigs, and a significant decrease in depth across 25 supercontigs in 18 of the 22 sequenced isolates (Fig. S5). To further verify relative ploidy within an isolate and the order of ploidy-changes, we inferred whether individual bases were ‘evenly’ -⁠ or ‘oddly’-distributed across Illumina reads within a single genome by binning their frequencies into histograms for each chromosome. The expectation here is that a chromosome with an even ploidy will tend towards a 50∶50 distribution across each single SNP, while chromosomes with an odd ploidy will tend towards a 33∶66 or 33∶33∶33 ratio across SNP-calls (Figs. S5, S6). This method identified even -⁠ or odd-ploidies for 92% of the chromosomes tested with >95% bootstrap support (Table S2).

**Fig. 1. Read depth across 22 genomes was normalised by total alignment depth and plotted against location in the genome using a 10 Kb long non-overlapping sliding window.**

Thirteen BdGPL and two BdCAPE isolates had greater numbers of bi-alleles than tri-alleles (corresponding to an even ploidy that most parsimoniously corresponds to diploidy) (Table S2), and six isolates belonging to all three separate lineages that had greater numbers of tri-alleles than bi-alleles (corresponding to an odd ploidy that most parsimoniously corresponds to triploidy). The remaining four isolates (BdGPL JEL423 & MODS27, BdCAPE SA1d & SA4c) had significant p-values showing between 1–3 chromosomes in lower ploidy levels relative to the remaining bi-allelic genome. Over these lower-ploidy chromosomes we observed greater numbers of tri-alleles than bi-alleles and no decrease in heterozygous base-calls (both of which should occur if these chromosomes were haploid). We therefore conclude that these four isolates have tetraploid genomes with the identified losses in read-depth corresponding to chromosomes that have lost a single copy and are now trisomic.

We were able to take advantage of replicate lines of BdCH, which were passaged for 40 generations with and without exposure to skin antimicrobial peptides collected from the water frog Pelophylax esculentus. In these culture lines, the ancestral putatively triploid isolate (BdCH ACON) differentially lost and gained copies of supercontig IV and V respectively when passaged without selection (BdCH CON2A), and gained a copy of supercontig V following treatment with antimicrobial peptide (BdCH APEP), which resulted in a significant reduction in mean growth inhibition (Text S1: In vitro Divergence of Independent Replicate Lines of BdCH; Fig. 2). Due to the fact that most of our isolates exhibiting CCNV were sequenced shortly following isolation from nature without sequential passage, we know that CCNV is occurring frequently in both wild and cultured isolates. The rapidity that these mutations are accumulating across our isolates shows that aneuploidies in Bd are occurring at rates that will generate genome diversity within the timescale of a single host infection.

Chromosome copy number variation was identified across the three <i>Bd</i>CH isolates (ACON and its progenitors CON2A and APEP) following 40 generations in culture with or without the addition of anti-microbial peptides (AMP), respectively. — Fig. 2. Chromosome copy number variation was identified across the three BdCH isolates (ACON and its progenitors CON2A and APEP) following 40 generations in culture with or without the addition of anti-microbial peptides (AMP), respectively.

Recombination

In order to detect the presence and frequency of recombination events we determined the phase of bi-allelic heterozygous polymorphisms (Table S1, Figs. S7, S8, S9). We focused our attention on SNPs that were supported by a high percent of uniquely mapped reads (Table S1) and reads agreeing with the phasing (Fig. S9). By performing pairwise comparisons of shared phased positions between each of our isolates, we found >99% of these sites remained in the same phase for intra-lineage comparisons and >92% for inter-lineage comparisons (Fig. S10). However, we also identified 4,974 haplotypes demonstrating crossovers (Fig. S11) where all four pairwise combinations of bases were observed. Of these, 2,007 occurred at unique positions/loci in the genome. Every pair of isolates that we compared (except between BdGPL isolates MAD (FR) and AUL (FR)) showed at least one haplotype that included an inferred crossover (Fig. S11). This was surprising given many of the isolates share a very recent common ancestor. For instance, we found that two isolates (MODS27 and MODS28) which were recovered from Discoglossus sardus at a single site in Sardinia on a single collection trip and are closely related (Fig. S2) had accumulated three crossovers. This shows that recombinant genotypes can accumulate even within highly-related free-living populations of Bd, a feature of this chytrid's population genetics that was first remarked upon by Morgan et al. in populations of Sierra Nevadan Rana muscosa [28].

The greatest proportion of phased positions demonstrating crossovers were found to occur between the three lineages, demonstrating an accumulation of recombinant haplotypes that scales with time of divergence (Fig. S11). For example, as many as 7.3% of the phased positions revealed crossovers that have accumulated since isolates BdGPL AP15 (IT) and BdCH ACON (CH) were separated. Crossovers were identified in every major chromosome, and predominantly identified in intergenic regions (143 Mb⁻¹ compared with 57 Mb⁻¹ for coding regions and 65 Mb⁻¹ for introns) (Figs. 3A, S12, S13).

Crossovers were detected with pairwise comparisons for each <i>Bd</i> isolate across every supercontig. — **Fig. 3. Crossovers were detected with pairwise comparisons for each Bd isolate across every supercontig.**

Crossovers were also found to occur with a higher frequency amongst isolates belonging to the lineages BdCAPE and BdCH (between 0.6 and 1.1% of phased positions, respectively) compared with 0–0.2% in BdGPL. This was a surprising finding given that the three BdCH isolates were separated by only 40 passages in the lab and were derived from a single isolate that had been relatively recently isolated in 2007. This suggests two hypotheses: Either in vitro passage under selective conditions promotes rapid recombination, or our isolate of BdCH is descended from a population of Bd that is more recombinogenic than BdGPL. To further study the amount of recombination within lineages and between isolates, we extracted haplotypes that were phased across all of the isolates within a given lineage and contained at least two alleles per loci (ranging in length from 11 nt to 33.3 Kb: Fig. S12). Because only 35 haplotypes were retained for the entire panel of BdGPL isolates, we also extracted haplotypes from two BdGPL subsets consisting of 3 and 5 isolates respectively, thus allowing higher numbers of crossovers to be retained. From each of these sets of haplotypes, we calculated a multilocus measure of linkage disequilibrium (the standardised index of association rBarD [29]) and applied Hudson's four-gamete test [30] in order to quantify the amount of recombination amongst isolates within each lineage (Table 2). Across the BdGPL groups, >30% of phased positions were in significant disequilibrium compared with 16% and 11% for BdCH and BdCAPE respectively. RbarD appeared to be robust against sample size differences, and gave values from BdGPL values of 0.79–0.82 compared against 0.58 and 0.61 for BdCH and BdCAPE. Finally, a smaller proportion of BdGPL subset haplotypes failed the four-gamete test compared with BdCAPE or BdCH isolates. Each of these findings shows that recombination is causing diversity within each of the lineages. However, the emergent BdGPL is far more clonal than either of the other two lineages.

Haplotypes from isolates belonging to each of the separate lineages were tested for linkage disequilibrium using the index of association (I<sub>A</sub>), rBarD and the 4-gamete test. — **Tab. 2. Haplotypes from isolates belonging to each of the separate lineages were tested for linkage disequilibrium using the index of association (I_A), rBarD and the 4-gamete test.**

We next investigated whether recombination had occurred between these three lineages since their divergence, by calculating θ, Weir's [31] formulation of Wright's fixation index (F_ST) for pairwise comparisons of each lineage across window lengths of 1.4 Kb and 10 Kb (Figs. S14 and 3B). We found that all three lineages were highly differentiated from one another across each chromosome, with only minor intra-chromosomal regions of high similarity (which mainly comprised a long stretch of rDNA located at the start of chromosome 14). This indicates that recombination amongst these lineages has not occurred since their separation. We then determined whether certain categories of genes were associated with higher-than-average rates of recombination using t-tests on numbers of crossovers after accounting for differing levels of heterozygosity and density of phased-sequences (Text S1: Identifying gene groups and names; Table S3, Figs. 4 and S15, S16). Surprisingly, we found only one group showing significant enrichment for crossovers: those showing homology to the C-terminal of the Crinkler (CRN) family of oomycete effector proteins found in the Phytophthora genus [32], [33]. Enrichment was found in both BdGPL and BdCAPE, whilst not in BdCH. Haplotypes that failed the four-gamete test were predominantly from coding-regions, but had no clear pattern of enrichment for any gene category (Table S4).

Fig. 4. Boxplots for eight non-overlapping gene categories comprising every gene were compared for ratios of non-synonymous to synonymous mutations for each of the three lineages (dN/dS) and numbers of crossovers per phased positions (PP) within each gene (≥2PP) for all isolates (outliers omitted for both).

Patterns of mutation and selection

To identify genes that are present in the reference sequence and absent in our panel of isolates (presence/absence polymorphism), we examined the read-depth across each of the genes. Only five genes were identified from our panel (Table S5), including three amongst BdCAPE isolates and two amongst BdCH isolates. Therefore, whilst high-levels of aneuploidy are occurring, it does not appear to be resulting in frequent gene loss. To study the patterns of mutation across the nuclear genome, we categorized each of the mutations by their location in the genome in terms of coding regions (CDS), introns and intergenic regions (Table S6). In every isolate we sequenced, every variant type was found in greater abundance per kilobase in the non-coding regions (with the exception of 0.01 Kb⁻¹ fewer heterozygous positions in the introns compared with the CDS for isolate MG1). This overall pattern can be explained through selection purging deleterious mutations from the CDS. In addition, we found homozygous polymorphisms to be highly supported in all lineages in terms of uniquely mapped reads, whilst un-phased bi-allelic heterozygous positions had a smaller total proportion in the divergent lineages compared with BdGPL, suggesting some heterozygous positions may be miscalled due to paralogs.

We categorised each of the mutations within the CDS into synonymous and non-synonymous mutations (Table S6). SNPs were responsible for 169,000 synonymous changes and 197,000 non-synonymous changes. Genes with putative roles in pathogenicity were grouped by searching for secretion signals, protease domains and carbohydrate binding domains (Text S1: Identifying gene groups and names), and tested each of these for enrichment of homozygous SNPs (Tables S7 and S8) and heterozygous positions (Table S9) using hypergeometric tests. We found that gene groups that carried a secretion signal (proteases, chitin-binding and uncharacterized secreted) as well as CRN-like genes, were significantly enriched for both homozygous and heterozygous polymorphisms relative to the whole set of genes. Predicted chitin-binding proteins that lacked a secretion peptide were not enriched for either homozygous SNPs or heterozygous positions (Tables S7 and S9), and non-secreted proteases were only enriched for synonymous amino acid changes. Conversely, CRN-like genes are only enriched for non-synonymous homozygous SNPs and not synonymous SNPs.

We next measured the rates of synonymous substitution (dS), non-synonymous substitution (dN) and omega (dN/dS = ω) for every gene in every isolate and compared values by grouping isolates into their lineages (Fig. 4, Table S10). In total, we identified 1,450 genes with ω≥1 in at least one of our isolates (BdCAPE = 816; BdCH = 746; BdGPL = 283), suggesting positive or diversifying selection. Although no clear pattern could be distinguished within BdGPL (Fig. S17) owing to the high degree of relatedness amongst isolates and thus relative paucity of polymorphism, CRN-like genes in both BdCAPE and BdCH had the greatest median, upper quartile and upper tail values of omega (Fig. S18). In addition, average ω values for secreted chitin-associated genes and proteases were marginally higher than their non-secreted counter parts. Uncharacterized secreted genes also had a greater ω than either of those non-secreted gene groups. Finally, a significant enrichment of both CRN-like genes and uncharacterized (secreted) genes with ω≥1 were identified in both BdCAPE and BdCH (Table S11).

By analysing each of these 1,450 genes with ω≥1using branch site models (BSM) in PAML along each of the three lineages of Bd, we identified a subset of 482 genes that show evidence for positive selection in at least one of the lineages. For BdCAPE and BdCH, a greater percent of each of the secreted gene categories were found to have accumulated an excess of non-synonymous mutations compared with their non-secreted counterpart gene categories (Table S11). Nine genes were also identified in all three lineages (Fig. S19), including four uncharacterized secreted and five uncharacterised non-secreted genes. However, the most striking finding of this analysis was found among BdGPL isolates where 349/482 (72%) of the genes showed a signature of positive selection compared with only 23% for each of the other two lineages. This finding suggests that BdGPL has been undergoing greater levels of positive selection than either BdCAPE or BdCH, despite the low numbers of sites under selection owing to the high levels of relatedness within this lineage.

Discussion

Recent studies have attributed aspects of Bd's pathogenesis to the presence of a number of putative virulence factors that include proteases and chitin-binding proteins [32], [33], [34]. The former category contain M36 or S41 domains that are thought to degrade host-cellular components, and these protease families are known to have undergone extensive expansions in Bd since its divergence from free-living saprobes such as Homolaphlyctis polyrhiza [32]. Chitin binding proteins are thought to be involved in pathogenesis by allowing Bd to bind to keratinized host cells and to subsequently enter the host cells [34]. To date, the functional nature of the crinkler-like family in Bd has only been inferred owing to their homology to host-translocated proteins of known virulence in oomycetes [33]. Our data show that, across this global panel of 22 isolates and three lineages, the secretome and crinkler-like family of genes manifest higher diversity of homozygous and heterozygous SNPs, enrichment for non-synonymous mutations and greater dN/dS (ϖ) ratios when compared against classes of genes that do not contain a signal peptide. This shows that these gene families are evolving most rapidly in Bd, and that gene-products that interact with the amphibian host are undergoing diversifying (or reduced purifying) selection when compared with those gene-products that remain intracellular. Our findings suggest that Bd has had an evolutionary association with amphibians that predates the radiation of the lineages that we have characterised here, and is further evidence that this chytrid has an obligate rather than an opportunistic association with its amphibian hosts.

By mapping read-depth and SNPs across these genomes, we discovered that widespread genomic variation occurs within and amongst Bd isolates from the level of SNPs up to heterogeneity in ploidy amongst genomes and amongst chromosomes within a single genome. Individuals from all three lineages harboured CCNV along with predominantly or even entirely diploid, triploid and tetraploid genomes. Recent research by Rosenblum et al. [35] has also identified widespread CCNV across diverse lineages of Bd recovered largely from infected amphibians in the Americas, including a single haploid chromosome in isolate BdGPL JEL289. This variation may itself, reflect only part of the full diversity in Bd,pathogensas +2/+3 shifts in ploidy, whole genomes in tetraploid, or chromosomes in pentaploid or greater, may occur and await discovery. Chromosomal genotype was shown to be highly plastic as significant changes in CCNV occurred in as few as 40 generations in culture. It is not known whether other chytrid species also undergo CCNV, or if this is a unique feature of Bd and hence may be intrinsic to its parasitic mode of life. Currently, CCNV is known to occur in a variety of protist microbial pathogens, including fungi, however it is currently not known whether this genomic-feature is specific to a parasitic life-style, or is a more general feature of eukaryote microbes; identifying the ubiquity of CCNV or otherwise across nonpathogenic species will therefore be of great interest. Further, the manner in which the plasticity of CCNV in Bd affects patterns of global transcription and hence the phenotype of each isolate also remains to be studied. However, it is clear from research on yeast, Candida and Cryptococcus, that CCNV significantly contributes to generating altered transcriptomic profiles, phenotypic diversity and rates of adaptive evolution even in the face of quantifiable costs; understanding the relationship between CCNV and Bd-phenotype will therefore likely be key to understanding its patterns of evolution at both micro -⁠ and macro-scales.

Whilst differing numbers of individual chromosomes presents a potential barrier to the standard model of meiosis, homologous recombination may still be occurring via mitotic processes within compatible genomes. In order to study recombination amongst our isolates, we determined the phase of our reads and constructed haplotypes that were suitable for traditional population genetic tests. This showed that, whilst the majority of the genomes from all three lineages manifest widespread linkage disequilibrium, recombination could still be detected across each chromosome and in all genomes. Crossovers (measured both as the proportion of SNPs that change phase and the numbers of haplotypes failing Hudson's four-gamete test) were found to occur much more frequently within the BdCAPE and BdCH lineages compared to BdGPL, and these two lineages accordingly manifest lower average linkage disequilibrium. All of the BdCH genomes that we sequenced stem from a single isolate collected in 2007. This suggests that either the high rates seen here have accrued since the isolate was taken into culture (suggesting a very rapid rate of in vitro recombination), or that we are characterising recombination events that occurred prior to the isolation of BdCH and are segregating as a consequence of the multiple-ploidy nature of Bd. In support of the latter hypothesis, comparisons between population-level data for BdGPL and BdCAPE show that BdGPL is far less recombinogenic and has been undergoing a largely clonal expansion since its emergence, consistent with previous observations made by James et al. [36]. These data suggests that the global BdGPL population is derived from a less recombinogenic ancestor than either BdCH or BdCAPE, that contemporary recombination is not occurring at a rapid rate and, where it occurs, is the result of a selfing rather than outcrossing events.

The discovery of a lower proportion of variable sites across haplotypes in addition to the lower proportion of heterozygous positions in BdGPL compared against BdCAPE or BdCH does not support the notion that BdGPL is an outbred hybrid lineage as previously proposed [22]. The discovery of a new Bd lineage found in Brazil (BdBrazil) along with an isolate that is a likely BdGPL/BdBrazil recombinant [24] strongly implies that Bd retains the ability to outcross, despite having a primarily clonal genome and life cycle. However, values of F_ST across our dataset show no introgression between the three lineages; this demonstrates that they have remained largely separate since their divergence and suggests that outcrossing between lineages of Bd is rare or, if it has occurs, remains spatially restricted. Further broad-scale collections of isolates and extension of our comparative-population genomic analyses will allow the assignment of more accurate rates of introgression across evolutionary timescales.

We show that rates of recombination are uneven across the genome, with CRN-like genes enriched for crossovers, suggests that either CRN-like genes might have features that favour recombination or that recombinants of these genes have a fitness advantage and are thus more likely to reach fixation than recombinants at other locations in the genome. CRNs were also enriched for non-synonymous polymorphisms, are characterised by a signal of directional selection, and are amongst the most polymorphic genes in Bd's genome. Within the oomycete genus Phytophthora, CRNs manifest diverse carboxy-terminal domains and high rates of homologous recombination targeted to the conserved HVLVXXP motif, suggesting that the mosaic domains of CRNs are being shuffled by recombination [2]. Recently, a number of Bd CRNs have been shown to be highly expressed on host tissue in vitro [37]. Therefore, whilst these genes in Bd lack a secretion signal, their expression, accumulation of genetic variation in terms of recombination and ϖ values, and similarities with oomycete CRNs strongly suggest that a number of these CRN-like genes are functional in Bd. However, whether they contribute directly or indirectly to the virulence of Bd remains to be determined.

Our demonstration of multiple hierarchies of cryptic genomic variation in Bd in terms of CCNV, ubiquitous and potentially targeted recombination, and natural selection, points to an ability to generate diversity without the necessity of an obligate sexual stage. Our study has uncovered high levels of genotypic plasticity that are likely to cause widespread phenotypic plasticity even without the need to invoke outcrossing. These large and small-scale changes are therefore likely to contribute to rapid evolutionary rates in the face of an effective host response. Such ‘genomic instability’ may explain the diverse phenotypic responses observed in Bd [38], and may also explain the enormous diversity of hosts and biomes that this generalist pathogen has managed to infect.

Materials and Methods

Full details are given in Text S1, Supplemental Materials and Methods. Briefly, twenty-two isolates that had been collected from nine countries and four continents were chosen for sequencing (Table 1). Paired-end Libraries were constructed according to the protocols provided by Illumina sequencing (Truseq kit). The genome sequence and feature file for the chytrid fungus Batrachochytrium dendrobatidis (Bd) strain JEL423 was downloaded from http://www.broadinstitute.org/ (GenBank project accession number AATT00000000). The feature file for JEL423 had all but the longest splice variants removed for each gene leaving 8794/8819 genes. We aligned our reads to the genome sequence using Burrows-Wheeler Aligner (BWA) v0.5.9 [26] with default parameters, converted to Samtools mpileup format using SAMtools v.0.1.18 [39] and polymorphisms called using the Binomial SNP-Caller from Pileup (BiSCaP) v0.11 [27]. For phylogenetic analysis we extracted polymorphisms covered ≥4 reads in all 22 isolates. FASTA files were converted into Nexus files and trees constructed using the Un-weighted Pair Group Method with Arithmetic Mean (UPGMA) algorithm in PAUP and visualised using Figtree [40] (Fig. S3). Gene groups were identified using gene-annotations, blastx searches (1e⁻⁰⁵ e-value cut-off) to the non-redundant BLAST database, SignalP3.0 [41], Merops [42] and Procarb604 v1 [43].

Chromosome copy number variation (CCNV) was identified using changes in both depth of coverage and percent of reads specifying two most frequent alleles at any locus. To quantify these changes, we first performed t-tests (with a cut-off of p<5⁻¹⁰) on the mean depths across the largest supercontig (supercontig 1) against each subsequent supercontig for each isolate (Fig. S5). Next, we calculated the percent of reads specifying the two most frequent alleles (Fig. S7) for each chromosome in each isolate separately using a minimum depth of 4 reads for both alleles and binned values falling between 47–53% (expected even ploidy/bi-allelic) and 30–36% and 63–69% (expected odd ploidy/tri-allelic). To account for depth and mutation variation within a chromosome, we performed 1000 bootstraps for either predominance of bi-allelic or tri-allelic peaks (Table S2). Using a 5% cut-off (5%<x<95%) we found 305/330 largest 15 chromosomes gave confident odd or even allelic peaks and was largely concordant with changes predicted by t-tests.

To detect recombination, we identified haplotypes using reads that overlapped two or more bi-allelic heterozygous positions. Haplotypes from each isolate were then compared to haplotypes in other isolates. We also calculated the Index of association (I_A), detecting linkage disequilibrium for a given set of haplotypes if VD>L (L_old). We also calculated rBarD values and performed 4 gamete tests between every combination of loci in a haplotype (Fig. S9) to quantify the amount of recombination occurring within populations. In addition, we applied Weir's [30] estimator of Wright's Fixation Index (F_ST) according to the equations given in Multilocus 1.3 [29].

For selection, we used the yn00 and codeml programs of PAML [44] implementing the Yang and Nielsen method [45] on every gene in every isolate and those with ω≥1 respectively. For codeml, we used the Branch site model (BSM) A (model = 2, NSsites = 2, fix_omega = 0) compared with the null model (model = 2, NSsites = 2, fix_omega = 1, omega = 1). Next, we calculated 2 * the log likelihood difference between the two compared models (2D′) with two degrees of freedom, and identified any with values greater than 8.1887 and 11.4076 (5% and 1% significance after Bonferroni correction). Enrichment for crossovers and polymorphisms was detected using hypergeometric tests and t-tests.

For in vitro divergence, an isolate of B. dendrobatidis from a Swiss Alytes obstetricans (isolate 0739) was subcultured into control (ACON) and peptide-treated (APEP) culture flasks containing 10 ml 1% tryptone media supplemented with 1% penicillin-streptomycin (Sigma) to reduce the risk of bacterial contamination. Cultures were incubated at 18°C and passaged every 4–5 d by scraping the side of the flask and transferring 1 ml into 9 ml fresh media. Peptide-treatment included addition to the media of 80 µg ml-1 skin defense peptides collected from Pelophylax esculentus (n = 15 combined) according to Daum et al. (2012) [46]. This was equivalent to the IC₅₀, or the concentration at which growth of Bd was inhibited by 50%.

Supporting Information

Zdroje

1. FisherMC, HenkDA, BriggsCJ, BrownsteinJS, MadoffLC, et al. (2012) Emerging fungal threats to animal, plant and ecosystem health. Nature 484 : 186–194.

2. RaffaeleR, KamounS (2012) Genome evolution in filamentous plant pathogens: why bigger can be better. Nat Rev Microbiol 10 : 417–430.

3. MaL, van der DoesHC, BorkovichKA, ColemanJJ, DaboussiM, et al. (2010) Comparative genomics reveals mobile pathogenicity chromosomes in Fusarium. Nature 464 : 367–73.

4. WinJ, MorganW, BosJ, KrasilevaKV, CanoLM, et al. (2007) Adaptive evolution has targeted the C-terminal domain of the RXLR effectors of plant pathogenic Oomycetes. Plant Cell 19 : 2349–2369.

5. HeitmanJ (2010) Evolution of eukaryotic microbial pathogens via covert sexual reproduction. Cell Host Microbe 8 : 86–99.

6. WittenbergAHJ, van der LeeTAJ, M'BarekSB, WareSB, GoodwinSB, et al. (2009) Meiosis drives extraordinary genome plasticity in the haploid fungal plant pathogen Mycosphaerella graminicola. PLoS ONE 4: e5863.

7. CookeDEL, CanoLM, RaffaeleS, BainRA, CookeLR, et al. (2012) Genome analyses of an aggressive and invasive lineage of the Irish potato famine pathogen. PLoS Pathog 8: e1002940.

8. BüttnerP, KochF, VoigtK, QuiddeT, RischS, et al. (1994) Variations in ploidy among isolates of Botrytis cinerea: implications for genetic and molecular analyses. Curr Genet 25 : 445–50.

9. CarrJ, ShearerGJr (1998) Genome size, complexity, and ploidy of the pathogenic fungus Histoplasma capsulatum. J Bacteriol 180 : 6697–6703.

10. SheltzerJM, BlankHM, PfauSJ, TangeY, GeorgeBM, et al. (2011) Aneuploidy drives genomic instability in Yeast. Science 333 : 1026–30.

11. AbbeyD, HickmanM, GreshamD, BermanJ (2011) High-resolution SNP/CGH microarrays reveal the accumulation of loss of heterozygosity in commonly used Candida albicans strains. G3: Genes Genomes Genetics 1 : 523–30.

12. LengelerKB, CoxGM, HeitmanJ (2001) Serotype AD strains of Cryptococcus neoformans are diploid or aneuploid and are heterozygous at the mating-type locus. Infect Immun 69 : 115–22.

13. HuG, WangJ, ChoiJ, JungWH, LiuI, et al. (2011) Variation in chromosome copy number influences the virulence of Cryptococcus neoformans and occurs in isolates from AIDS patients. BMC Genomics 12 : 526.

14. SionovE, LeeH, ChangYC, Kwon-ChungKJ (2010) Cryptococcus neoformans overcomes stress of azole drugs by formation of disomy in specific multiple chromosomes. PLoS Pathog 6: e1000848.

15. ReedyJL, FloydAM, HeitmanJ (2009) Mechanistic plasticity of sexual reproduction and meiosis in the Candida pathogenic species complex. Curr Biol 19 : 891–9.

16. ForcheA, MageePT, SelmeckiA, BermanJ, MayG (2009) Evolution in Candida albicans populations during a single passage through a mouse host. Genetics 182 : 799–811.

17. Kwon-ChungKJ, ChangYC (2012) Aneuploidy and drug resistance in pathogenic fungi. PLoS Path 8: e1003022.

18. ForcheA, AlbyK, SchaeferD, JohnsonAD, BermanJ, et al. (2008) The parasexual cycle in Candida albicans provides an alternative pathway to meiosis for the formation of recombinant strains. PLoS Biol 6: e110.

19. ArnoldB, BombliesK, WakeleyJ (2012) Extending coalescent theory to autotetraploids. Genetics 192 : 195–204.

20. SchoustraSE, DebetsAJM, SlakhorstM, HoekstraRF (2007) Mitotic recombination accelerates adaptation in the fungus Aspergillus nidulans. PLoS Genet 3: e68.

21. FisherMC, GarnerTWJ, WalkerSF (2009) Global emergence of Batrachochytrium dendrobatidis and amphibian chytridiomycosis in space, time, and host. Annu Rev of Microbiol 63 : 291–310.

22. FarrerRA, WeinertLA, BielbyJ, GarnerTWJ, BallouxF, et al. (2011) Multiple emergence of genetically diverse amphibian-infecting chytrids include a globalised hypervirulent lineage. Proc Natl Acad Sci U S A 108 : 18732–6.

23. GokaK, YokoyamaJ, UneY, KurokiT, SuzukiK, et al. (2009) Amphibian chytridiomycosis in Japan: Distribution, haplotypes and possible route of entry into Japan. Mol Ecol 18 : 4757–74.

24. SchloegelLM, ToledoLF, LongcoreJE, GreenspanSE, VieiraCA, et al. (2012) Novel, panzootic and hybrid genotypes of amphibian chytridiomycosis associated with the bullfrog trade. Mol Ecol 21 : 5162–5177.

25. Batrachochytrium dendrobatidis Sequencing Project, Broad Institute of Harvard and MIT (http://www.broadinstitute.org/).

26. LiH, DurbinR (2009) Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25 : 1754–60.

27. FarrerRA, HenkDA, MacLeanD, StudholmeDJ, FisherMC (2013) Using false discovery rates to benchmark SNP-callers in next-generation sequencing projects. Sci Rep 3 : 1512.

28. MorganJAT, VredenburgVT, RachowiczLJ, KnappRA, SticeMJ, et al. (2007) Population genetics of the frog-killing fungus Batrachochytrium dendrobatidis. Proc Natl Acad Sci USA 104 : 13845–50.

29. AgapowP, BurtA (2001) Indices of multilocus linkage disequilibrium. Mol Ecol Notes 1 : 101–2.

30. HudsonR, KaplanN (1985) Statistical properties of the number of recombination events in the history of a sample of sequences. Genetics 111 : 147–164.

31. Weir BS (1996) Genetic Data Analysis II. Sinauer, Sunderland.

32. JonesonS, StajichJE, ShiuSH, RosenblumEB (2011) Genomic transition to pathogenicity in chytrid fungi. PLoS Pathog 7: e1002338.

33. SunG, YangZ, KoschT, SummersK, HuangJ (2011) Evidence for acquisition of virulence effectors in pathogenic chytrids. BMC Evol Biol 11 : 195.

34. AbramyanJ, StajichJE (2012) Species-specific chitin binding module 18 (CBM18) expansion in the amphibian pathogen Batrachochytrium dendrobatidis. mBio 3: e00150–12.

35. RosenblumEB, JamesTY, ZamudioKR, PoortenTJ, RodriguezD, et al. (2013) Complex history of the amphibian-killing chytrid fungus revealed with genome resequencing data. Proc Natl Acad Sci U S A 110 : 9385–9390.

36. JamesTY, LitvintsevaAP, VilgalysR, MorganJAT, TaylorJW, et al. (2009) Rapid global expansion of the fungal disease chytridiomycosis into declining and healthy amphibian populations. PLoS Pathog 5 : 1–12.

37. RosenblumEB, PoortenTJ, JonesonS, SettlesM (2012) Substrate-specific gene expression in Batrachochytrium dendrobatidis, the chytrid pathogen of amphibians. PLoS ONE 7: e49924.

38. WoodhamsDC, AlfordRA, BriggsCJ, JohnsonM, Rollins-SmithLA (2008) Life-history trade-offs influence disease in changing climates: strategies of an amphibian pathogen. Ecology 89 : 1627–39.

39. LiH, HandsakerB, WysokerA, FennellT, RuanJ, et al. (2009) The Sequence Alignment/Map format and SAMtools. Bioinformatics 25 : 2078–9.

40. DrummondAJ, RambautA (2007) BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evol Biol 7 : 214.

41. BendtsenJD, NielsenH, HeijneGV, BrunakS (2004) Improved prediction of signal peptides: SignalP 3.0. J Mol Biol 340 : 783–795.

42. RawlingsND, BarrettAJ, BatemanA (2010) MEROPS: the peptidase database. Nucleic Acids Res 38: D227–D233.

43. MalikA, FirozA, JhaV, AhmadS (2010) PROCARB: A Database of Known and Modelled Carbohydrate-Binding Protein Structures with Sequence-Based Prediction Tools. Adv Bioinformatics 436036.

44. YangZ (2007) PAML 4: Phylogenetic Analysis by Maximum Likelihood. Mol Biol Evol 24 : 1586–91.

45. YangZ, NielsenR (2000) Estimating Synonymous and Nonsynonymous Substitution Rates Under Realistic Evolutionary Models. Mol Biol Evol 17 : 32–43.

46. DaumJM, DavisLR, BiglerL, WoodhamsDC (2012) Hybrid advantage in skin peptide immune defenses of water frogs (Pelophylax esculentus) at risk from emerging pathogens. Infect Genet Evol 12 : 1854–1864.

47. BergerL, SpeareR, DaszakP, GreenDE, CunninghamAA, et al. (1998) Chytridiomycosis causes amphibian mortality associated with population declines in the rain forests of Australia and central America. Proc Natl Acad Sci U S A 95 : 9031–6.