Convergent Evolution During Local Adaptation to Patchy Landscapes

Download PDF České info

Often, a large species range will include patches where the species differs because it has adapted to locally differing conditions. For instance, rock pocket mice are often found with a coat color that matches the rocks they live in, these color differences are controlled genetically, and mice that don’t match the local rock color are more likely to be eaten by predators. Sometimes, similar genetic changes have occurred independently in different patches, suggesting that there were few accessible ways to evolve the locally adaptive form. However, the genetic basis could also be shared if migrants carry the locally beneficial genotypes between nearby patches, despite being at a disadvantage between the patches. We use a mathematical model of random migration to determine how quickly adaptation is expected to occur through new mutation and through migration from other patches, and study in more detail what we would expect successful migrations between patches to look like. The results are useful for determining whether similar adaptations in different locations are likely to have the same genetic basis or not, and more generally in understanding how species adapt to patchy, heterogeneous landscapes.

Published in the journal: . PLoS Genet 11(11): e32767. doi:10.1371/journal.pgen.1005630
Category: Research Article
doi: https://doi.org/10.1371/journal.pgen.1005630

Summary

Introduction

The convergent evolution of similar phenotypes in response to similar selection pressures is a testament to the power that selection has to sculpt phenotypic variation. In some cases this convergence extends to the molecular level, with disparate taxa converging to the same phenotype through parallel genetic changes in the same pathway, genes, or even by precisely the same genetic changes [1–3]. Convergent adaptation also occurs within species, if different individuals adapt to the same environment through different genetic changes. There are a growing number of examples of this in a range of well studied organisms and phenotypes [4]. Such evolution of convergent phenotypes is favored by higher mutational input, i.e., higher total mutational rate and/or population size [5]. The geographic distribution of populations can also affect the probability of parallel mutation within a species: a widespread species is more likely to adapt by multiple, parallel mutations if dispersal is geographically limited, since subpopulations will adapt via new mutations before the adaptive allele arrives via migration [6]. Standing variation for a trait can also increase the probability of convergence, as this increases the probability that the selective sweep will be soft (beginning from a base of multiple copies), which leads to genetic patterns similar to convergent adaptation [7, 8] whether or not the copies derive from the same mutation,

Intuitively, convergence is also more likely when geographically separated populations adapt to ecologically similar conditions. The probability that convergent adaptations arise independently before adaptations spread between the populations by migration will be larger if these adaptive alleles are maladapted in intervening environments, since such adverse conditions can strongly restrict the spread of locally adapted alleles [9].

One elegant set of such examples is provided by the assortment of plant species that have repeatedly adapted to patches of soil with high concentrations of heavy metals (e.g., serpentine outcrops and mine tailings) [10–13]; the alleles conferring heavy metal tolerance are thought to be locally restricted because they incur a cost off of these patches. Similarly, across the American Southwest, a variety of species of animals have developed locally adaptive cryptic coloration to particular substrates, e.g., dark rock outcrops or white sand dunes [14]. One of the best-known examples is the rock pocket mouse (Chaetodipus intermedius), which on a number of black lava flows has much darker pelage than on intervening areas of lighter rock [15]. Strong predator-mediated selection appears to favour such crypsis [16], and, perhaps as a result of this strong selection against migrants, at least two distinct genetic changes are responsible for the dark pigmentation found on different outcrops [17]. Similar situations have been demonstrated in other small rodent systems [18–20] and in lizards [21].

In this paper, we study this situation, namely, when a set of alleles provide an adaptive benefit in geographically localized patches, separated by inhabited regions where the alleles are deleterious. The main questions are: Under what conditions is it likely that each patch adapts in parallel, i.e., convergently through new mutation, and when is it likely that migration carries these alleles between patches? How easy will it be to detect adaptive alleles that are shared by migration, i.e., over what genomic scale will a population genetic signal be visible?

We work in a model of continuous geography, using a variety of known results and new methods. In the section Establishment of a locally adaptive allele due to mutational influx we develop a simple approximation, Eq (2), for the rate at which patches become established by new mutations. The most important conceptual work of the paper occurs in the section Establishment of a locally adaptive allele due to migrational influx, where we develop an understanding of the process by which alleles move from an existing patch to colonize a novel patch, culminating in Eq (11) for the rate at which this happens. We combine these two results in the section The probability of parallel adaptation between patches to discuss the probability of parallel adaptation, Eq (12). To understand the genomic signal left by adaptations shared by migration, in the section Length of the hitchhiking haplotype, we study the time it will take for an allele to transit between patches, (Eq (18)), and thus the length of haplotype that hitchhikes with it (Eq (19)). Finally, in the section Applications we apply this work to understand the geographic scale of convergent evolution in Chaetodipus intermedius.

Results

Model of a patchy landscape

Consider a population spread across a landscape to which it is generally well adapted, but within which are patches of habitat to which individuals are (initially) poorly adapted. (When we refer to “patches” it is to these pieces of poor habitat.) Suppose it takes only a single mutational change to create an allele (B) that adapts an individual to the poor habitat type. The required change could be as specific as a single base change, or as general as a knockout of a gene or one of a set of genes. This change occurs at a (low) rate of μ per chromosome per generation, and has fitness advantage s_p relative to the unmutated type (b) in these “poor” habitat patches. Outside of these patches the new allele has a fitness disadvantage of s_m in the intervening areas, with s_p and s_m both positive. (Here we define “fitness” to be the intrinsic growth rate of the type when rare.) We assume that the disadvantage s_m is sufficiently large that, on the relevant timescale, the allele is very unlikely to fix locally in these intervening regions. (The case where s_m = 0 requires a different approach, which we do not treat here.)

We are interested in the establishment of mutations in the “poor” patches by either migration or mutation, which depends on whether the allele can escape initial loss by drift when rare. Therefore, we need not specify the fitness of the homozygote; only that the dynamics of the allele when rare are determined by the fitness of the heterozygote. More general dominance will only make small corrections to the dynamics until initial fixation, with the exception of the recessive case, which we omit. In other words, we follow the literature in treating the diploid model as essentially haploid.

We also assume population density ρ is constant across the range (even in the “poor” patches). The variance in offspring number is ξ², and that the mean squared distance between parent and child is σ² (i.e., σ is the dispersal distance). We will deal with migration by immediately appealing to the central limit theorem, treating the total distance traveled across t dispersal events as Gaussian with variance tσ², and do not discuss the (interesting) cases where rare, long-distance dispersal events are more important (see e.g., [6, 22, 23] for discussion).

Past work on spatial models

We will make use of several previous results from the literature on migration in models of spatially varying selection. [24], [25], and [9] first analyzed deterministic models of of this sort. [26] and [27] showed that in one dimension, if the physical width of the patch is less than ( σ / 2 s p ) tan -⁠ 1 ( s m / s p ), then there is no stable equilibrium with both b and B present, so that migrational swamping prevents B from establishing (see also [28] for a review). [29], using general theory of [30], showed that patches must be at least this critical size so that a new mutation has positive probability of establishment (in the large population density limit). The same paper also showed that this probability of establishment of a new mutation decays exponentially with distance from the patch, with the scale given by σ / 2 s m. Mutations appearing within the patch have probability of establishment strictly less than the probability for a panmictic population, but approaching this as the size of the patch increases. This latter panmictic probability of establishment we denote p_e, and often approximate p_e by 2s_p/ξ² [31, 32]. This result holds quite generally for a geographically spread population that experiences a uniform selection pressure [33, 34]. More general work on the mathematically equivalent problem of density-dependent population growth has obtained much more general criteria under which a habitat configuration will support a stable polymorphic equilibrium [35], and determined the habitat configuration that maximizes the probability of establishment [36].

Establishment of a locally adaptive allele due to mutational influx

We first compute the time scale on which a new B mutations will appear and fix in a single, isolated poor habitat patch in which no B allele has yet become established. As we are interested in patches where local adaptation can occur, we will assume that the patch is larger than the cutoff for local establishment mentioned above.

Let p(x) be the probability that a new mutant B allele that arises at location x relative to the center of the patch fixes within the poor habitat patch. Under various assumptions, precise expressions can be found for p(x) [29], but results will be more interpretable if we proceed with a simple approximation. The total influx per generation of mutations that successfully establish is the product of population density ρ, mutation rate μ, and the integral of p(x) over the entire species range:

This depends in a complicated way on the patch geometry and selection coefficients, but still scales linearly with the mutational influx density ρμ. If we consider a patch of area A, whose width is large relative to σ / 2 s m, then a reasonable approximation is to ignore migration, so that p(x) = p_e ≈ 2s_p/ξ² within the patch, and p(x) = 0 otherwise. This approximates the integrand p(x) in Eq (1) by a step function, which will be a good approximation if the patch is large relative the scale over which p(x) goes from 0 to p_e, or if p(x) is close to p_e at some point and is symmetric about the edge of the patch. We examine this more generally via exact calculation of p(x) in the section Numerical calculation of the probability of establishment.

The rate at which mutations arise and colonize a patch of area A is then

i.e., the product of mutational influx in the patch (ρAμ) and the probability of establishment (p_e ≈ 2s_p/ξ²). If this rate is low, then the time (in generations) until a mutation arises that will become locally established within the patch is exponentially distributed with mean 1/λ_mut. Assuming that once a mutation becomes established it quickly reaches its equilibrium frequency across the patch, the time scale on which new patches become colonized by the B allele from new mutation is therefore approximately 1/λ_mut.

Establishment of a locally adaptive allele due to migrational influx

Now suppose that there are two patches separated by distance R (i.e., the shortest distance between the two is R). If the B allele has arisen and become established in the first patch, but has not yet appeared in the second, we would like to know the time scale on which copies of B move between patches by migration (supposing that no B allele arises independently by mutation in the meantime). To determine this, we study the fine-scale genealogy of alleles that transit between patches, obtaining along the way other useful information about the genealogy of B alleles. Doing this we arrive at Eq (11) for the rate at which an allele established in one patch spreads to a neighboring one by migration.

Migration–selection balance ensures that there will be some B alleles present in the regions where they are disadvantageous, but only rarely, far away from the patch where B is established. Denote the expected frequency of allele B at displacement x relative to the patch by q(x), and assume that the shape of the patch is at least roughly circular. Following [24] or [9], one can write down a differential equation to which q(x) is the solution, and show that q(x) decays exponentially for large ∣x∣, with a square-root correction in two dimensions:

where d is the dimension (d = 1 or 2), and C is a constant depending on the geographic shape of the populations and the selection coefficients. In applications we fit C to the data; to get concrete numbers from theory we take C = 1 if necessary.

As J. Hermisson pointed out in comments on an earlier draft, this functional form has a simple intuition: local migration leads to the exponential decay, since if each migrant at distance x produces an average of c descendants that make it to distance x + 1, then in one dimension, the number of migrants must decay as exp(−cx); and the square-root term in two dimensions enters because the available area grows with x. The “renewal” aspect of this same argument suggests that for Eq (3) to hold, ∣x∣ must be larger than a few multiples of σ.

These asymptotics fit simulations quite well, as shown in Fig 1. To be clear about the assumptions implicit here, below we provide a derivation of Eq (3) in section The equilibrium frequency, as well as justification for the asymptotics below in section Asymptotic solution for the equilibrium frequency. In one dimension, the equation can be solved to give q(x) in terms of Jacobi elliptic functions [37]; see the Supporting Information.

**Fig. 1. Simulated migration-selection balance.**

This expected frequency q(x) is the time-averaged occupation frequency, i.e., the total number of B alleles found across T generations near location x, per unit area, divided by T. If, for instance, q(x) = .01 and the population density is ρ = 10 individuals per unit area, then in a survey tract of unit area around x we expect to find one individual every 10 generations—or, perhaps more likely, 10 individuals every 100 generations. This points to an important fact: close to the original patch, the “equilibrium frequency” q(x) describes well the density of the B allele at most times, but far away from the patch, the equilibrium is composed of rare excursions of families of B alleles, interspersed by long periods of absence. An example of one of these rare excursions is shown in Fig 2. The relevant time scale on which B alleles migrate between patches is given by the rate of appearance of such families.

A segment of the temporal dynamics of the same simulation as <em class="ref">Fig 1B</em>. — **Fig. 2. A segment of the temporal dynamics of the same simulation as *Fig 1B*.**

This suggests decomposing the genealogy of B alleles into families in the following way. First, consider the genealogy of all B alleles that were alive at any time outside of the original patch. This is a collection of trees, each rooted at an allele living outside the patch whose parent was born inside the patch. Next, fix some intermediate distance r₀ from the established patch, and erase from the genealogy every allele that has no ancestors living further away than r₀ to the patch. This fragments the original set of trees into a collection of smaller trees that relate to each other all B alleles living outside of r₀ at any time, and some living inside of r₀; we refer to these trees as “families”. If r₀ is large enough that there are not too many families and interactions between family members are weak, then these families can be approximated by a collection of independent spatial branching processes whose members are ignored if they enter the original patch, illustrated in Fig 3. (This can be made formal in the limit of large population density, also taking r₀ large enough that the chance of reaching the original patch is small.) The opportunity for adaptation depends on how often these families of B alleles encounter the new patch. Suppose that the area occupied by the new patch is S. We can learn about the rate of arrival of families at S from the time-averaged number of B alleles expected in S were it not a patch (i.e., if B was still maladaptive in S), which from Eq (3) is

where q(S) = ∫_S q(y)dy. The quantity we wish to compute, the effective rate at which families of migrant B alleles establish in the new patch, is

The quantities on the right-hand side depend implicitly on r₀, but their product does not.

**Fig. 3. Cartoon of the decomposition described in the text.**

The genealogy of migrant families

To understand the process of adaptation by migration, we therefore need to look more closely at the rare families of B alleles far from the original patch. It is reasonable to model each such family as a branching process, i.e., assuming that each individual migrates, then reproduces, independently of all other family members. Migration takes the individual in a random direction for a random distance whose mean square is σ², and if the individual arrives to a patch where the B allele is favored, we drop the individual from the family. Since the B allele is uniformly deleterious elsewhere, we can decouple reproduction and migration by first forming a tree in which each individual has a random number of offspring, then assigning spatial locations to each individual in the tree, and finally pruning any branches that wander into a patch. This pruning removes both branches leading back into the already established patch (which are irrelevant) and branches leading into the new patch (we will need to count these). Since we are concerned with the rare migrant families that transit quickly between patches, we ignore back migrants to the already established patch.

Since each individual has on average e^−s_m offspring, the expected family size after time t is e^−ts_m. Therefore, if we define k_e(t) to be the chance that the family has gone extinct by time t, and let K_t be the (random) family size conditioned on nonextinction, then by Bayes’ rule, e -⁠ s m t = ( 1 -⁠ k e ( t ) ) E [ K t ]. It turns out that if the distribution of offspring numbers is not too heavy-tailed (see [38] for details), then K_t has a limiting distribution: K t → d K, and

In other words, the chance of survival to time t is asymptotically a constant multiplied by e^−s_mt, and conditional on survival, the family size is given by a fixed distribution.

This suggests that families can be well-described by motion of a central “trunk”, which dies at a random time whose distribution is given by k_e(t), surrounded by “branches”, as depicted in Fig 3. We can understand this with a decomposition described by [39]. When we condition on survival until t, we condition on survival of at least one trunk lineage (the long red lineage in Fig 3). Given this lineage, the other reproduction events are undistinguished—the genealogy of a family that survives until t can be decomposed into a trunk that lasts until t and that sprouts independent, unconditioned branches that may or may not survive until t. Concretely, if we write Z_t for the number of offspring in the branching process alive at time t, then

where each Z^(s,k) is an independent copy of Z that arose from the main branch in generation s, and each W_s is independent with distribution that tends to P { W s = k } ≃ P { W ≥ k } / E [ W ] as s → ∞, where W is the unconditioned offspring number distribution. In particular, E [ W s -⁠ 1 ] ≃ E [ W ( W -⁠ 1 ) ] / E [ W ]; if W is Poisson then this is equal to e^−s_m. Since each subfamily Z^(s,k) is a branching process with mean offspring number less than one, most of these die out quickly, and so Z_t is composed of a cluster of recently split subfamilies. If we take the expectation of Eq (7), since E [ Z t ] = e -⁠ s m t, we get the limiting mean family size conditioned on nonextinction:

Subfamilies typically live for 1/s_m generations, thus spreading across an area of width σ / s m. The oldest subfamily in a surviving migrant family is of order log(1/s_m)/s_m generations, so the entire family is spread slightly wider.

Heuristics

The parallel form suggested by Eqs (4) and (5) suggests that we can find λ_mig(S) by multiplying the mean density q(S) by the ratio of the second terms in the two equations, i.e.

where again the “occupation density” is computed without the selective advantage. To make this precise we need to understand what we mean by “outflux of families”; we do this below in section Hitting and occupation, and provide a heuristic argument here.

Both numerator and denominator in Eq (9) count only families that have arrived at the new patch, so we can consider a family conditioned on having successfully transited between the patches. To obtain simply interpretable expressions, we assume that once any member of the family arrives at the patch, all other members are close by, and so have roughly equal opportunity to establish in the new patch.

Suppose the family arrives with K individuals. The probability that at least one individual leaves a lineage that establishes in the patch is 1 −⁠ (1 −⁠ p_e)^K. On the other hand, in the absence of the patch of new habitat, each of the K newly arrived individuals would leave behind a lineage with mean total size ∑ t = 0 ∞ e -⁠ t s m ≈ 1 / s m, roughly half of which would stay in the new patch. Therefore, the expected contribution of the family to the occupation density is K/2s_m. (For more careful consideration of the geometry of the patch and the distinction between occupation density and number of individuals, see section Hitting and occupation, below.)

This suggests that

There are now two extremes to consider. If P { K > 1 / p e } is small, then each family has a small chance of establishment, and the numerator is approximately p e E [ K ]. On the other hand, if p_e is not too small, then the numerator would be close to 1, as any family that arrives will likely establish.

The remaining term is q(S), the integral of Eq (3) over the patch. Since the patches are at distance R, if the new patch has width smaller than σ / 2 s m, then q(S) is approximately q(R) multiplied by the area of the new patch. If S is larger, then we should use the area of the closest strip of width σ / 2 s m to the original patch; see Fig 3, and section The integral q(S) for more details. In either case, denote this area A′; for numerical examples we compute q(S) directly.

Combining these, and absorbing the term E [ W ( W -⁠ 1 ) ] / E [ W ] from Eq (8) into the constant C,

where A′ is the area of the new patch no further than R + σ / 2 s m from the old patch. When we use this below, we set p_e = 2s_p/ξ². Note that the approximation has the undesirable property that it is non-monotonic in s_m, when we expect the rate of adaptation via migration to be a decreasing function of s_m. However, it only increases for small s_m, and is a decreasing function of s_m for s_m > ∣R∣²/2σ², which is the parameter region that we are restricted to by other assumptions.

A review, and a test, of the assumptions

Now that we have expressions for the mean rates of adaptation by new mutation, Eq (2), and by migration from an already colonized patch, Eq (11), it seems helpful to step back and review the assumptions underlying the asymptotic results we have used, or will use below. Our results should hold exactly in the limit of large, circular patches sufficiently far apart, large population density, and small selection coefficients of equal magnitude.

To be more precise, the mutation rate results most obviously apply if s_m ≈ s_p, if the patch diameter is large relative to σ / min ( s m , s p ), and the patch is not too far from circular. If s_m ≪ s_p then a strip of width σ / s m should be added to the outside of the patch in computing A for Eq (2), and if s_m ≫ s_p, a strip of width σ / s p should be subtracted from the patch. As for the migration rate, we assume that each patch is large enough to support a stable population of B alleles (A 1 / d > ( σ / 2 s p ) tan -⁠ 1 ( s m / s p )). The geometry and size of the patch will also affect the approximation of Eq (48). Next, in using Eq (3), we assume that the inter-patch distance is large relative to the characteristic length (R > σ / s m), and that local drift is not so strong that the equilibrium frequency is at least approximately attained (using Wright’s effective neighborhood size, s_m ≫ 1/(ρσ^d)). The last requirement is necessary because if the B allele fixes in large neighborhoods where it is deleterious, we cannot approximate its dynamics via a branching process. We also neglect the time for migration-selection equilibrium to be reached. As discussed above, we also assume that migration to a new patch takes place over a number of generations; if there are sufficient rare, long-distance migration events that would move between patches in a single hop, this would require a different analysis.

To test the robustness of our results to the various approximations we used, we used individual-based simulations on a one-dimensional lattice of 501 demes, with ρ haploid individuals per deme, dispersal to nearby demes with σ = 0.95, and run for 25,000 generations. More details are given below in section Simulation methods, and the number of simulations used and parameter values are given in supplementary S1 and S2 Tables. To estimate the mean time until adaptation by mutation, we used one centrally located patch of 99 demes and a mutation rate of μ = 10⁻⁵, and to estimate the mean time until adaptation by migration, we used two centrally located patches of 99 demes separated by a varying number of demes, with one patch initialized to have the B allele at frequency 0.8. In each case, we measured the “time to adaptation” as the amount of time until at least 100 B alleles were present in the previously unadapted patch. Fig 4 summarizes how the results compare to theory, excluding parameter combinations that violate the assumptions discussed above, or where a majority of simulations did not adapt by 25,000 generations. S1 and S2 Figs show all times, and depictions of typical simulation runs are shown in S3, S4, S5, S6, S7, S8, S9 and S10 Figs.

**Fig. 4. Observed versus predicted time until adaptation.**

The agreement is reasonable for both. The theoretical value 1/λ_mut underestimates the mean time to mutation by a factor of around 2 that increases slightly with s_m. This is to be expected for two reasons: First, we compute the time to reach 100 B alleles, while theory predicts the time until the progenitor of those 100 B alleles arose. Second, the expression for λ_mut neglects effects near the boundary of the patch, and the larger s_m is, the harder it is for mutations that arise near the edge of the patch to establish. The theoretical values 1/λ_mig again has a margin of error of a factor of about 2. S10 Fig shows simulations at more parameter values.

The probability of parallel adaptation between patches

We now turn to the question of whether the separated patches adapt by parallel genetic changes or by the spread of a migrant allele between the patches. As only a single mutation is required for individuals to adapt to the poor habitat patch, subsequent mutations that arise after an allele becomes established in the patch gain no selective benefit. Similarly, an allele introduced into a patch by migration will not be favored by selection to spread, if the patch has already been colonized. Therefore, mutations selectively exclude each other from patches, over short time scales, and will only slowly mix together over longer time scales by drift and migration, an assumption we also made in [6]. In reality, different mutations will likely not be truly selectively equivalent, in which case this mixing occurs over a time-scale dictated by the difference in selection coefficients.

We assume that once a B allele is introduced (by migration or mutation) it becomes established in the poor habitat patch rapidly if it escapes loss by demographic stochasticity. Under this approximation, and the “selective exclusion” just discussed, after one of the patches becomes colonized by a single B allele, the other patch will become colonized by migration or mutation, but not by both. As such, the question of how the second patch adapts simply comes down to whether a new mutation or a migrant allele is the first to become established in the second patch. To work out how adaptation is likely to proceed, we can compare Expressions (2) and (11) above for the effective rates of adaptation by new mutation and by migration. We work in one dimension, as the square root term appearing for two dimensions is relatively unimportant.

We first consider the order of magnitude that our parameters need to be on in order for adaptation via mutation or migration to dominate. Let γ = min(1, s_m/p_e) and w = A/A′—since A is the area of the not-yet-adapted patch and A′ is the area of its closest σ / 2 s m strip, w is approximately the width of the patch in units of σ / 2 s m. Effective migration and mutation rates will be on the same order if A μ ≈ 2 A ′ γ s m exp ( -⁠ R 2 s m / σ ), where R is the distance between the patches. In other words, the migrational analogue of “mutational influx” (μρA) is 2 ρ A ′ γ s m exp ( -⁠ R 2 s m / σ ), which depends fairly strongly on the selective disadvantage s_m between patches. Equivalently, the rates are roughly equal if R / σ = log ( 2 γ s m / ( w μ ) ) / 2 s m, which gives the critical gap size past which adaptation will be mostly parallel in terms of selection coefficients, patch width, and mutation rate.

If we take μ = 10⁻⁵, the patch width to be ten times the length scale σ / 2 s m so w = 10 (and −log(wμ) ≈ 9.2), and s_m > p_e so that γ = 1, then adaptation is mostly parallel for patches separated by gaps larger than R / σ > ( 9 . 2 + log ( 2 s m ) ) / 2 s m. If the selective pressure is fairly strong (say, s_m = .05), then for convergence to be likely, the distance between patches must be at least 21.8 times the dispersal distance (i.e., R/σ > 21.8). If s_m is much smaller, say s_m = .001, the required distance increases to R/σ > 67.

If the mutation rate is higher—say, μ = 10⁻³—the required separation between patches is only reduced to R/σ > 7.3 with s_m = .05. If s_m = .001, then with this higher mutational influx this calculation predicts that mutation will always be faster than migration—however, this should be taken with caution, since as discussed above, this model does not hold if R is of the same order as σ or if s_m is small enough that local drift is more important.

We can go beyond these rough calculations to find the probability of parallel adaptation if we are willing to take our approximations at face value. Doing so, the time until the first of the two patches is colonized by the B allele will be approximately exponentially distributed with mean 1/(2λ_mut). Following this, the time until the second patch is subsequently colonized (via either migration or new mutation) will be exponentially distributed with mean 1/(λ_mut + λ_mig). Both these rates scale linearly with population density (ρ) and the probability of establishment of a rare allele (p_e ≈ 2s_b/ξ², for p_e > s_m), so that the overall time to adaptation is robustly predicted to increase with offspring variance (ξ²) and decrease with population density and beneficial selection coefficient. Furthermore, the probability that the second adaptation is a new mutation, i.e., the probability of parallel adaptation, with γ = min(1, s_m/p_e), is

so that the probability of parallel mutation should increase approximately logistically with the distance between the patches, on the spatial scale σ / 2 s m.

We tested this using the same simulations as Fig 4, by using the empirical distributions of the respective times to adaptation to estimate the probability, for each parameter combination, that a new, successful mutation appears before a successful migrant arrives from another, already adapted patch. The results are compared to Eq (12) in Fig 5.

Probability that two patches evolve by independent mutations (i.e., convergently), as a function of their geographic separation (<i>R</i>) and the strength of selection between patches (<i>s</i><sub><i>m</i></sub>). — **Fig. 5. Probability that two patches evolve by independent mutations (i.e., convergently), as a function of their geographic separation (R) and the strength of selection between patches (s_m).**

Multiple patches

Suppose that instead of patches there are many, each patch adapting through either migration or mutation. If the times between successive establishments are the result of many nearly-independent attempts with small probability of success, the process of occupation is well approximated by a Markov chain, whose state records the labeling of patches according to the colonizing allele (or none, for those that haven’t yet adapted).

If not yet adapted, a focal patch of area A will acquire the adaptation by new mutation at rate 2s_p μρA/ξ² (Eq (2)). Suppose that patches 1, …, k are already adapted, and that these patches are at distances R₁, …, R_k away from this unadapted patch. The total rate of adaptation through migration, from Eq (11) if patches do not interfere with each other, is

If the patch adapts through migration, the probability the adapted allele comes from patch i is

Since the compound parameter s_pρ is common to both migration and mutation rates, it functions only as a time scaling of the process, and therefore has no effect on the final configuration, namely, the sets of patches sharing the genetic basis of their adaptive response. We can also rescale time by the typical patch size, and introduce a parameter, say a k = A k / A ¯, making the properties (other than the time scaling) of the discrete model independent of the numerical sizes of the demes themselves. This is complementary to the results of [5], who showed that multiple mutations are likely to arise within a panmictic deme if the population-scaled mutation rate 2Nμ is greater than 1.

We can say more if we suppose that all patches have the same area and are the same distance from each other (i.e., the “island model”). The resulting process is a continuous-time version of the “Chinese restaurant process” described in [40], and so the final partition of types across demes has the Ewens sampling distribution with parameter w μ / ( 4 s m exp ( -⁠ 2 s m R / σ ) ). Now we can compute most properties we might want about the process. For instance, the proportion of demes that shares the same origin as a randomly sampled deme is approximately Beta distributed—see [41]. The high connectedness of the discrete deme island model means that the expected number of distinct alleles grows with the log of the number of demes. This strongly contrasts with the continuous spatial model where the local nature of dispersal means that doubling the species range will double the number of mutations expected.

Length of the hitchhiking haplotype

If a patch adapts through new mutation or a rare migrant lineage, the genomic region surrounding the selected site will hitchhike along with it [42], so that at least initially, all adapted individuals within a patch share a fairly long stretch of haplotype. Pairs of adapted individuals within a patch will initially share a haplotype of mean genetic length of about log(ρAs_p)/s_p around the selected site, as long as the patch is reasonably well mixed by dispersal (otherwise see [43]). This association gets slowly whittled down by recombination during interbreeding at the edge of the patch, but there will always be longer LD nearby to the selected site [44].

When an already adapted patch colonizes another through migration, the newly colonized patch will inherit a long piece of haplotype around the selected site from the originating patch. The genetic length of this haplotype is roughly inversely proportional to the number of generations the allele spends “in transit”, because while very rare, the allele will be mostly in heterozygotes, and so each generation provides another opportunity for a different haplotype to recombine closer to the transiting B allele. A large, linked haplotype may still arrive and fix in the new patch, in which case the haplotype has literally hitchhiked across geography! Fig 6 shows a simulation of such an event, including the lineage that founds an adaptive population on a second patch, and the length of the haplotype shared between this lineage and one in the original patch. The time that the lineage is outside the region where the B allele is common (dark red in Fig 6), the haplotype that accompanies it is broken down rapidly. After the lineage establishes on the patch, the rate of decay of the haplotype is slowed significantly, since most others with which it recombines have similar haplotypic backgrounds.

**Fig. 6. Simulation of adaptation by migration.**

Above we argued that a good model for this transit is a single Brownian trunk lineage surrounded by a cloud of close relatives of random size K whose chance of surviving until t generations is 1 −⁠ k_e(t). Consider a single such family, and let τ be the (random) time at which it first hits the new patch, with τ = ∞ if it never does. We assume that the length of the hitchhiking segment depends only on the time spent by the trunk lineage of the first successful migrant “in transit” between the patches, i.e., τ conditioned on τ < ∞. In so doing, we neglect recombination between B alleles (since they are at low density in transit), and the possibility that more than one successful migrant family is in transit at once (so that faster migrants would be more likely to have arrived first).

Since each generation spent in transit provides an opportunity for recombination, if recombination is Poisson, the length of the haplotype (in Morgans) initially shared between populations on each side of the selected locus is exponentially distributed with mean τ. Therefore, if L is the length of hitchhiking segment on, say, the right of the selected locus, then

Computing this depends on 1 −⁠ k_e(t), the probability a family survives until t. Since 1 -⁠ k e ( t ) ≃ e -⁠ s m t / E [ K ], we approximate the lifetime distribution of a family by an exponential with mean 1/s_m. We can then use standard results on hitting times of d-dimensional Brownian motion that is killed at rate s_m (see [45] 2.2.0.1 and 4.2.0.1). In particular, if the patch is circular with radius w and lies at distance R from the already adapted patch, then

where K₀ is a modified Bessel function of the second kind. We are interested in lineages that manage to reach the patch before being killed, i.e., having τ < ∞, which occurs with probability P { τ < ∞ } = lim ℓ → 0 E [ exp ( -⁠ ℓ τ ) ].

To keep the expressions simple, in the remainder of this section we only compute quantities for one dimension. By Bayes’ rule,

This can be differentiated to find that the expected transit time is

and that Var [ τ ∣ τ < ∞ ] = ( R 2 s m / σ ) × 1 / ( 2 s m ) 2. (A saddle point approximation provides an alternative route to these expressions.)

The form of Eq (17) implies that if Y is an exponential random variable with rate R 2 / σ, then L has the same distribution as ( Y + s m ) 2 -⁠ s m. Furthermore, the expected length of shared hitchhiking haplotype is

and Var [ L ] = 2 s m σ 2 / R 2 + 4 σ 3 2 s m / R 3 + 5 σ 4 / R 4. For two dimensions, asymptotics of Bessel functions show that L has the same tail behavior, but how other properties might change is unclear.

As a rule of thumb, Eq (18) means that families who successfully establish move at speed 2 s m σ towards the new patch—if s_m, the strength of the selection against them, is smaller, the need to move quickly is less imperative. Then, Eq (19) means that the haplotype length is roughly the length one would expect given the mean transit time, so more weakly deleterious transiting alleles arrive with them shorter haplotypes. Note that we have become used to seeing σ divided by 2 s m, rather than multiplied; it appears here because σ / 2 s m is a length scale, and to convert it to a speed this is divided by 1/2s_m.

Applications

Coat color in the rock pocket mouse Chaetodipus intermedius is a classic example of local adaptation [14, 15]. These mice live on rocky outcrops throughout the southwestern United States and northern Mexico, and generally have a light pelage similar to the predominant rock color. However, in some regions these mice live on darker substrates (e.g., old lava flows), and in these patches have much darker pigmentation, presumably to avoid visual predators. Some of the largest patches of dark rock range from 10km to 100km wide and lie 50–400km from each other, and dark-colored populations of C. intermedius have been found on many of these. (However, patches of all sizes occur across all scales in a heterogeneous manner.) [46] demonstrated that on one of these flows (Pinacate), much of the change in coloration is due to an allele at the MC1R locus. This dark allele differs from the light allele by four amino acid changes, and has a dominant or partially dominant effect depending on the measure of coat color. The Pinacate allele is not present in a number of other populations with dark pelage, suggesting these populations have adapted in parallel [17, 46]. However, [47] reasoned that, elsewhere in the range, multiple close dark outcrops may share a dark phenotype whose genetic basis has been spread by migration despite intervening light habitat.

A key parameter above was the dispersal distance divided by the square root of strength of selection against the focal allele between patches, σ / s m. [48] studied the frequency of the dark MC1R allele and coat color phenotypes, at sites across the (dark) Pinacate lava flow and at two nearby sites in light-colored rock. On the lava flow the dark MC1R allele is at 86% frequency, while at a site with light substrate 12km to the west (Tule), the frequency is 3%. The dark allele was entirely absent from Christmas pass, a site with light substrate 7km north of Tule, and 3km further from the lava flow. In the other direction, the dark MC1R allele was at 34% at a site with light substrate 10km to the east of the flow (O’Neill). Note that such apparent asymmetry is expected, since as noted above the migration–selection equilibrium can be highly stochastic. These numbers give us a sense of a plausible range of the parameters. Assuming the dark allele is at 50% frequency at the edge of the lava flow, we can fit Formula (3) to these frequencies (similar to [48]). Doing this for Tule we obtain σ / s m ≈ 3 km, and for O’Neill, σ / s m ≈ 30 km, giving us a range of cline widths. We also need estimate of s_m. Using σ ≈ 1km [49, 50], these cline widths imply that s_m = 1/9 and 1/900 are reasonable values.

The mutational target size μ for the trait is unclear. While the Pinacate dark haplotype differs from the light haplotype at four amino acid residues, it is likely that not all of these changes are needed for a population to begin to adapt. Also, there a number of genes besides MC1R at which adaptive changes affecting pigmentation have been identified in closely related species and more broadly across vertebrates [51]. To span a range of plausible values, we use a low mutation rate of μ = 10⁻⁸ (a single base pair), and a high mutation rate μ = 10⁻⁵ (a kilobase). Finally, we set A = 100km² (roughly the size of the Pinacate patch). In Fig 7 we show the dependence of the probability of parallel mutation on the distance between lava flow patches using these parameters, showing that parallel mutation should become likely over a scale of tens to a few hundred kilometers between patches.

**Fig. 7. Results for the rock pocket mouse.**

Given the large selection coefficient associated with the dark allele on the dark substrate, we expect the initial haplotype associated with either a new mutation or migrant allele to be large. Fig 7 also shows how long the founding haplotype shared between populations is expected to be, from Eq (19). The initial length can be quite long between geographically close patches (tens of kilometers). However, for the wider cline width (σ / s m = 30 km), adaptation by migration can still be likely for patches 100km apart, but the shared basis may be hard to detect, as the length of shared haplotype can be quite short.

Discussion

This paper is an investigation into the basic question: What is the spatial resolution of convergent local adaptation? In other words, over what spatial scale of environmental patchiness will the process of adaptation develop independent solutions to evolutionary problems? The answer to this depends most strongly on σ / s m, the dispersal distance divided by the square root of the strength of selection against the allele between patches. It depends much more weakly on the selective benefit within the patches or (perhaps surprisingly) the population density, although these two factors will determine the time-scale over which adaptation will occur (and note that population density could affect the selection coefficients). This is in contrast with models of panmictic populations [8, 52, 53] and geographically spread populations adapting to homogeneous selection pressures [6], where the probability of multiple, independently arising adaptive alleles increases with the population size. However, in all of these models the dependence on the beneficial selection coefficient is absent or weak, due to the fact that selection both aids establishment of new alleles and the spread of existing alleles (but see [53] for the complications of varying population sizes).

We have also shown that while weaker selection against alleles will make sharing of adaptations between patches easier, it also makes it harder to spot such sharing, since the lucky alleles that manage to colonize new patches move slower, and thus carry a shorter shared haplotype. This issue is amplified by the fact that the length of haplotype shared within patches decays over time, potentially making the identification of shared adaptations to old selection pressures difficult.

Perhaps the most useful rule-of-thumb quantities we found were the following. The effective rate of migration into an as-yet-unadapted patch from an already-adapted patch distance R away—the analogue of the mutational influx μρA—is 2 ρ A ′ s m exp ( -⁠ R 2 s m / σ ). Equivalently, the critical gap size between patches past which adaptation is likely independent is R = ( σ / 2 s m ) log ( 2 s m / ( w μ ) ). Finally, successfully transiting migrant lineages move at rate σ 2 s m, and so shared haplotype lengths between patches will be of order R / ( σ 2 s m ).

In developing the set of approximations in the paper we have ignored a number of complicating factors. We now briefly discuss these.

Standing variation

We have focused on relative rates of adaptation, since in applications where adaptation has occurred, the question is whether adaptations in distinct patches have appeared independently or not. However, any adaptation that does occur may have to make use of standing variation, if mutation rates are low. The case of a panmictic population was studied by [8], and we study the case of a continuous, spatial population in [54]. If parallelism in local adaptation of the sort we study here is due to standing variation rather than new mutation, then the dynamics of adaptation should not depend strongly on migration patterns (but the initial spatial distribution of standing variation may).

Dominance

We have mostly ignored the issue of dominance by dealing with essentially haploid models, and appealing to the fact that the dynamics we study occur where the mutation is rare, and hence mostly present only in heterozygotes. Our results should hold as a good approximation to dominant and partially dominant alleles (with s_m the selection against heterozygotes). If, however, the mutation is recessive, then it is essentially neutral where rare, and so would encounter much less resistance to spreading between patches. The shape of the cline obtained is given by [24]. This is counteracted, however, by the increased difficulty with which the mutation would establish once arriving in a new patch, if the beneficial effect is also recessive. As such it is not clear what our intuition should be about the contribution of recessive alleles to adaptation via migration. Further work is needed to put empirical observations of local adaptation by recessive alleles in a theoretical context. It is similarly unclear how the model should extend to a polygenic trait.

Decay of shared haplotypes

To provide context for the results on shared haplotype length in section Length of the hitchhiking haplotype it is important to also understand the process by which haplotypes are whittled down within patches. The initial haplotype that sweeps within a patch will be dispersed over time by recombination. Likewise, the haplotype that is shared between patches coadapted by migration will also break down (Eq 19). However, a long time after the initial sweep, we may still expect to find individuals within the patch sharing longer haplotypes around the selected locus than with individuals elsewhere, since selection against migrants decreases mean coalescence times within the patch near the selected locus. The literature on clines (e.g., [44]) has important information, but more work is needed to provide robust estimates for these processes. Questions about the genomic length-scale of signals of sweeps shared by migration have also been addressed in discrete population settings [55, 56], reviewed in [57]. This work has shown that the length of the shared swept haplotype is often significantly shorter than the sweep within each patch, resulting in a pattern of shoulders of elevated F_ST between adapted populations some distance away from the shared selected allele. It would be of interest to see how similar patterns can arise in a continuous population setting, as a way of uniting these results.

Long distance migration

We have also ignored the possibility of very long distance migration, instead focusing on local dispersal (hence Gaussian by the central limit theorem). However, dispersal distributions can be very heavy tailed, with a small fraction of individuals moving very long distances indeed [22, 58]. In addition, over long time-scales, very rare chance events (mice carried off by hurricanes and the like; [59, 60]) could play a role in spreading migrant alleles if adaptation by other means is sufficiently unlikely. Such tail events could greatly increase the probability of shared adaptation above that predicted by our model. Furthermore, if adaptive alleles do move between distant patches via rare, long distance migration then they will be associated with a much longer shared haplotype than predicted by Eq (19). As such, we view our results as a null model by which the contribution of long distribution migrants to adaptation could be empirically judged.

Other geographic models

We have studied circular patches of habitat at long distances from each other. Real habitat geometry can be much more complex, e.g., with archipelagos of patches of varying sizes, or patches connected by long, skinny corridors, for instance. The work of [61] comes closest to a general theory of balanced polymorphisms in such habitats. It is possible that our techniques could be applied in their much more general setting, as both are based, fundamentally, on branching process approximations. It is also interesting to think about the probability of convergent adaptation to continuously varying environments, e.g. replicated environmental clines.

Concluding thoughts

The falling cost of population genomic sequencing means that we will soon have the opportunity to study the interplay of adaptation with geography and ecology across many populations within a species. Our work suggests that even quite geographically close populations may be forced to locally adapt by repeated, convergent, de novo mutation when migration is geographically limited and selective pressures are divergent. Thus, systems where populations have been repeatedly subject to strong local selection pressures may offer the opportunity to study highly replicated convergent adaptation within a similar genetic background [1]. Such empirical work will also strongly inform our understanding of the ability of gene flow to keep ecologically similar populations evolving in concert [62]. Our results suggest that adaptation to shared environments is certainly no guarantee of a shared genetic basis to adaptation, suggesting that rapid adaptation to a shared environment could potentially drive speciation if the alleles that spread in each population fail to work well together [63].

Materials & Methods

Simulation methods

First we briefly describe the simulations we used for illustration and validation (the R code used is provided in S1 Scripts and at http://github.com/petrelharp/spatial_selection). We simulated forward-time dynamics of the number of alleles of each type in a rectangular grid (either one -⁠ or two-dimensional) of demes with fixed size N. Each generation, each individual independently chose to reproduce or not with a probability r depending on her type and location in the grid; locally beneficial alleles were more likely to reproduce. Each extant individual then either remained in the same location with probability 1 −⁠ m or else migrated a random number of steps in a uniformly chosen cardinal direction; for most simulations m = 0.2 and the probability of migrating k steps was proportional to 2^−k for 1 ≤ k ≤ 5. In 2D, diagonal steps were also used. This gave us values of σ = 0.95 deme spacings in 1D and σ = 0.74 in 2D. Once migrants were distributed, each deme was uniformly resampled back down to N individuals. (Although we described the simulation in terms of individuals, we kept track only of total numbers in an equivalent way.)

The base probability of reproduction in each generation in simulations for type b alleles was r = 0.3; this was then multiplied by 1 + s to get the probability of reproduction for type B, where the value of s is either s_m or s_p depending on the individual’s location. This determines the values of s_m and s_p reported in the figures, and do not depend on the basic rate of reproduction. However, to obtain values for s_p and s_m when comparing theory to simulation, we computed the rate of intrinsic growth, i.e., the s so that the numbers of B alleles when rare would change by e^st after t generations in the absence of migration. (The resulting values are close to the first notion of s, but give better agreement with theory, which uses the second definition.)

To sample lineages, we first simulated the population dynamics forwards in time, then sampled lineages back through time by, in each generation, moving each lineage to a new deme with probability proportional to the reverse migration probability weighted by the number of B alleles in that deme in the previous time step. If more than one lineage was found in a deme with n alleles of type B, then each lineage picked a label uniformly from 1 … n, and those picking the same label coalesced. Since reproduction is Poisson, this correctly samples from the distribution of lineages given the population dynamics.

Numerical calculation of the probability of establishment

When rare, copies of a new mutant allele are approximately independent and experience a uniform selective benefit; and can therefore be treated as a branching process. Furthermore, whether or not a new, beneficial mutation establishes or is lost to demographic stochasticity is determined by this initial phase where it is rare. Fortunately, the probability that a branching process dies out can be found as a fixed point of the generating function of the process [38]. Therefore, we calculated explicitly the generating function for a spatial branching process with nearest-neighbor migration on a one-dimensional lattice and a Poisson number of offspring with mean 1 + s, where s could vary by location, and iterated this forward to convergence to obtain 1 −⁠ p(x), the probability a single mutation appearing at x would fail to establish. We considered two situations: where s is a step function, and where it has a linear transition. These solutions are shown, and parameters described, in Fig 8.

Fig. 8. The probability of establishment of a new mutant allele as a function of distance from the edge of the region where it is beneficial, in an abrupt transition (left) and a gradual transition (right).

Here (and at other parameter choices) we see that the probability of establishment p(x) goes to the equilibrium value (approximately p_e = 2s/ξ²) within the patch; the transition is fairly symmetrical about the edge of the patch, even if the edge of the patch is not sharp. Additional experimentation indicated that the fit remains equally good for other parameter values, even if migration can move further than one deme and offspring numbers are not Poisson. This lends credence to our approximation that the integral of p(x) over the entire range is close to p_e multiplied by the area of the patch.

The equilibrium frequency

For completeness, and clarity as to the scalings on the relevant parameters, here we provide a derivation of the differential equations referred to above Eq (3), and establish the asymptotics given in that equation. One route to the “equilibrium frequency” of the allele outside the range where it is advantageous is as follows; see [9] or [29] (or [64] or [65] or [24]) for other arguments in equivalent models, and see [66] and/or [67] for a general framework for the stochastic processes below.

Suppose that the population is composed of a finite number of small demes of equal size N arranged in a regular grid, and that selection (for or against) the allele is given by the function s(x), with x denoting the spatial location. Each individual at location x reproduces at random, exponentially distributed intervals, producing a random number of offspring with distribution given by X who then all migrate to a new location chosen randomly from the distribution given by x + R, where they replace randomly chosen individuals. If x + R is outside of the range, then they perish. Each individual’s time until reproduction is exponentially distributed: the reproduction rate is 1 if it carries the original allele, or is 1 + s(x) if it carries the mutant allele. Suppose that the number of offspring X has mean μ; the variance of X will not enter into the formula (but assume X is well-behaved). Also suppose that the migration displacement R has mean zero and variance σ²; in more than one dimension, we mean that the components of the dispersal distance are uncorrelated and each have variance σ².

Let Φ t N ( x ) be the proportion of mutant alleles present at location x at time t, and Φ_t(x) the process obtained by taking N → ∞ (which we assume exists). Denote by δ_x a single unit at location x, so that e.g. Φ t N + δ x / N is the configuration after a mutant allele has been added to location x. For 0 ≤ ϕ ≤ 1, we also denote by X ¯ ϕ the random number of mutant alleles added if X new offspring carrying mutant alleles replace randomly chosen individuals in a deme where the mutant allele is at frequency ϕ (i.e. hypergeometric with parameters (X, Nϕ, N(1 −⁠ ϕ))); similarly, X ˜ ϕ is the number lost if the new offspring do not carry the allele (i.e. hypergeometric with parameters (X, N(1 −⁠ ϕ), Nϕ)). (We like to think of Φ t N as a measure, but it does not hurt to think of Φ^N as a vector; we aren’t providing the rigorous justification here.) Then the above description implies that for any sufficiently nice function f(Φ) that

In the final expectation, R and Φ are independent. This follows by taking first-order terms in 1/N in the Taylor series for f, and the fact that E [ X ¯ ϕ ] = ϕ μ and E [ X ˜ ϕ ] = ( 1 -⁠ ϕ ) μ. We can see two things from this: First, since this is a first-order differential operator, the limiting stochastic process Φ obtained as N → ∞ is in fact deterministic (check by applying to f(Φ) = Φ(x)² to find the variance). Second, if we want to rescale space as well to get the usual differential equation, we need to choose Var[R] = σ² and s(x) to be of the same, small, order; this is another way of seeing that σ / s is the relevant length scale (as noted by [9]). More concretely, suppose that the grid size is ϵ → 0, that Var[R] = (σϵ)², and that the strength of selection is s(x)ϵ, suppose that Φ_t(x) is deterministic and twice differentiable, and let ξ(t, x) = Φ_t/ϵ(x); then the previous equation with f(Φ) = Φ(x) converges to the familiar form:

Here we have taken first the population size N → ∞ and then the grid size ϵ → 0; we could alternatively take both limits together, but not if ϵ goes to zero too much faster than N grows. One reason for this is that at finite N, the process Φ_t is an irreducible finite-state Markov chain with absorbing states at 0 and 1; therefore, the inevitable outcome is extinction of one type or another, which is not the regime we want to study.

In one dimension, we are done (and discuss exact solutions in S1 Text); in higher dimensions, we are more interested in the mean frequency at a given distance r from a patch. If we take a radially symmetric patch centered at the origin (so s only depends on r), and let ξ(t, r) denote the mean occupation frequency at distance r, then the polar form of the Laplacian in d dimensions gives us that Eq (22) is

Asymptotic solution for the equilibrium frequency

A radially symmetric equilibrium frequency ξ(t, x) = q(∣x∣), with s(r) = −s < 0 for all r > r₀, solves for r₀ < r < ∞,

Since q(r) → 0 as r → ∞, so q(r) (1 −⁠ q(r)) ≈ q(r), it can be shown that the true equilibrium frequency q is close, for large r, to the solution to

This has general solution given by a modified Bessel function: using [68] 8.494.9, the general solution is

where C′ and r₁ are chosen to match boundary conditions. Asymptotics of Bessel functions ([68], 8.451.6) then imply that

where C is a different constant.

Hitting and occupation

Here we make a more precise argument to back up Expression (11) for the migration rate. The argument made above in section Heuristics applies to general migration mechanisms, since it relies only on a decomposition of the migrant families upon hitting the new patch; but it is also imprecise in subtle ways that are difficult to formalize. Here we take a somewhat different tack, supposing that it suffices to model the spatial movement of a migrant family by following only the motion of the “trunk” (i.e., the red line in Fig 3), and supposing this motion is Brownian, with variance σ² per generation. (Recall σ is the dispersal distance.) We then use facts about Brownian motion and branching processes, to compute more precise versions of Eqs (4) and (5).

We are approximating the dynamics of the focal allele in the region further away than r₀ from the patch as the sum of independent migrant families, each of whose dynamics are given by a spatial branching process (as depicted in Fig 3). Call B(r₀) the region closer than r₀ to the patch, and ∂B(r₀) its boundary. Denote by γ(x) the mean rate of outflux of migrant families from a point x ∈ ∂B(r₀), i.e., the time-averaged density of individuals near a point x in ∂B(r₀) that are the founders of new migrant families. Expressions (4) and (5) are a simple product of the “outflux of families”, when in fact they should be an integral of γ(x) over possible locations. However, it will turn out that the integrand of Eq (5) is well-approximated by a constant multiple of the integrand of Eq (4).

First consider Eq (4) for the equilibrium frequency. Suppose that Z is one such spatial branching process as above in section The genealogy of migrant families, started at time 0 with a single individual at x, and write S for the new patch. The mean occupation measure of Z in the region S, which we denote u(x, S), can be thought of informally as the expected total number of offspring of a family beginning at x that ever live in S. Let q(S) = ∫_S q(y)dy denote the total equilibrium frequency in S. This is decomposed in Expression (4) as the sum of mean occupation densities of a constant outflux of branching processes from ∂B(r₀):

Now we will decompose u(x, S) under the assumption that the marginal distribution of the spatial motion of a single lineage is Brownian. Let B_t be a Brownian motion with variance σ², and τ_† an independent Exponential(s_m) time. The mean occupation time of Z spent in a region S is,

where P x gives probabilities for Brownian motion begun at x (i.e., B₀ = x), and likewise E x. If we define τ_S to be the hitting time of S by the Brownian motion B, and μ_S(x) to be the hitting distribution of ∂S by B_{τ_S} conditioned on τ_S < τ_†, by the strong Markov property this is equal to

where now E μ denotes expectations for Brownian motion for which the distribution of B₀ is μ, and g(A) is defined to be the latter expectation, which does not depend on x if S is circular (with area A).

This form we can now compare to the Expression (5) for the outflux of successful migrants. Consider the probability that a migrant family beginning with a single individual at x will ever establish in the new patch. It would be possible to analyze this probability directly, as in [29]; but here we take a simpler route, approximating this by the chance that the trunk hits the new patch, multiplied by the chance that at least one member of the family escapes demographic stochasticity and successfully establishes in the new patch. Write h(x, S) for the probability that the Brownian trunk hits the patch S before the family dies out, and f(S) for the chance that the family manages to establish in the new patch, given that it successfully arrives. (This is approximately independent of x.) As for q(S) above, Expression (5) is properly an integral against the outflux γ(x):

If we make the approximation that Z hits the new patch only if the trunk of Z does, and recall that 1 −⁠ k_e(t) is the chance that Z survives for t generations, then

Therefore, we have that

Since the integrands in Expressions (30) and (38) only differ by a factor that (at least asymptotically) does not depend on the distance between x and S, we can obtain the migration rate by multiplying the equilibrium frequency by this factor. This factor will not depend on A, because although f(A) and g(A) in principle depend on the patch size (and geometry), the dependence is very weak. For instance, the width of a circular patch only very weakly affects the chance of establishment of a new migrant that appears on its edge, as long as the patch is large enough.

By Eqs (30) and (38),

Once we show that g(A) ≈ 1/(2s_m), we will have arrived at the result, Eq (10).

The function g(A) is the expected amount of time that a Brownian motion begun on the edge of a disk of area A is expected to spend inside the disk before τ_†. This is integral of the Green function for the Bessel process of the appropriate order, so using [45], and letting w be the width of the patch, in d = 1,

and in d = 2, by [68] 5.56.2,

where K₀ and K₁ are modified Bessel functions of the second kind. In either case, g(A) ≈ 1/(2s_m), which is the approximation we use in the main text.

The integral q(S)

In the development above we need to approximate the integral of Eq (3) across the area occupied by the new patch, q(S) = ∫_S q(x)dx. The precise answer depends on the shape and orientation of the patches; but we aim for a usable approximation, primarily in terms of the shortest distance from the old patch to the new patch, denoted R, and assuming R is large enough we can take Eq (3) as an equality.

Since q(R) decreases as ∣x∣ increases,

where A is the area of S. This will be a good approximation if S is small.

In one dimension, if S has length ℓ,

In two dimensions, suppose that T is a rectangle enclosing S with one axis aligned towards the original patch and transverse width w. Then

If we absorb π into the constant C, this is of the form ( width ) × ( σ / 2 s m ) × q ( R ), i.e., roughly the area, after replacing the length by σ / 2 s m.

In both cases, the upper bound is q(S) ≤ A′ q(x), where A′ is the area of the parts of S that are no more than R + σ / 2 s m away from the original patch. Lower bounds could be obtained along similar lines.

Supporting Information

Zdroje

1. Stern DL. The genetic causes of convergent evolution. Nat Rev Genet. 2013 Nov;14(11):751–764. 24105273

2. Zhen Y, Aardema ML, Medina EM, Schumer M, Andolfatto P. Parallel Molecular Evolution in an Herbivore Community. Science. 2012;337(6102):1634–1637. Available from: http://www.sciencemag.org/content/337/6102/1634.abstract. doi: 10.1126/science.1226630 23019645

3. Martin A, Orgogozo V. The loci of repeated evolution: a catalog of genetic hotspots of phenotypic variation. Evolution. 2013;67(5):1235–1250. Available from: http://dx.doi.org/10.1111/evo.12081. 23617905

4. Arendt J, Reznick D. Convergence and parallelism reconsidered: what have we learned about the genetics of adaptation? Trends Ecol Evol. 2008 Jan;23(1):26–32. doi: 10.1016/j.tree.2007.09.011 18022278

5. Pennings PS, Hermisson J. Soft Sweeps II—Molecular Population Genetics of Adaptation from Recurrent Mutation or Migration. Mol Biol Evol. 2006;p. msj117. Available from: http://mbe.oxfordjournals.org/cgi/content/abstract/msj117v1.

6. Ralph P, Coop G. Parallel Adaptation: One or Many Waves of Advance of an Advantageous Allele? Genetics. 2010;186(2):647–668. Available from: http://www.genetics.org/cgi/content/abstract/186/2/647. doi: 10.1534/genetics.110.119594 20660645

7. Orr HA, Betancourt AJ. Haldane’s sieve and adaptation from the standing genetic variation. Genetics. 2001 Feb;157(2):875–884. 11157004

8. Hermisson J, Pennings PS. Soft sweeps: molecular population genetics of adaptation from standing genetic variation. Genetics. 2005 Apr;169(4):2335–2352. doi: 10.1534/genetics.104.036947 15716498

9. Slatkin M. Gene Flow and Selection in a Cline. Genetics. 1973;75(4):733–756. Available from: http://www.genetics.org/cgi/content/abstract/75/4/733. 4778791

10. Kruckeberg AR. Intraspecific Variability in the Response of Certain Native Plant Species to Serpentine Soil. American Journal of Botany. 1951;38(6):pp. 408–419. doi: 10.2307/2438248

11. Macnair MR. Why the evolution of resistance to anthropogenic toxins normally involves major gene changes: the limits to natural selection. Genetica. 1991;84(3):213–219. doi: 10.1007/BF00127250

12. Schat H, Vooijs R, Kuiper E. Identical Major Gene Loci for Heavy Metal Tolerances that Have Independently Evolved in Different Local Populations and Subspecies of Silene vulgaris. Evolution. 1996;50(5):pp. 1888–1895. doi: 10.2307/2410747

13. Turner TL, Bourne EC, Von Wettberg EJ, Hu TT, Nuzhdin SV. Population resequencing reveals local adaptation of Arabidopsis lyrata to serpentine soils. Nat Genet. 2010 Jan;42(3):260–3. doi: 10.1038/ng.515 20101244

14. Benson SB. Concealing coloration among some desert rodents of the southwestern United States. No. v. 40 in University of California publications in zoology. University of California Press; 1933. Available from: http://books.google.com/books?id=Lis-AQAAIAAJ.

15. Dice LR, Blossom PM. Studies of Mammalian Ecology in Southwestern North America With Special Attention to the Colors of Desert Mammals. Carnegie Institution; 1937.

16. Kaufman DW. Adaptive Coloration in Peromyscus polionotus: Experimental Selection by Owls. Journal of Mammalogy. 1974;55(2):pp. 271–283. doi: 10.2307/1378997

17. Hoekstra HE, Nachman MW. Different genes underlie adaptive melanism in different populations of rock pocket mice. Mol Ecol. 2003 May;12 : 1185–1194. Available from: http://www3.interscience.wiley.com/journal/118890526/abstract. doi: 10.1046/j.1365-294X.2003.01788.x 12694282

18. Dice LR. Ecologic and Genetic Variability within Species of Peromyscus. The American Naturalist. 1940;74(752):pp. 212–221. Available from: http://www.jstor.org/stable/2457573. doi: 10.1086/280889

19. Steiner CC, Rompler H, Boettger LM, Schoneberg T, Hoekstra HE. The Genetic Basis of Phenotypic Convergence in Beach Mice: Similar Pigment Patterns but Different Genes. Mol Biol Evol. 2009;26(1):35–45. Available from: http://mbe.oxfordjournals.org/cgi/content/abstract/26/1/35. doi: 10.1093/molbev/msn218 18832078

20. Kingsley EP, Manceau M, Wiley CD, Hoekstra HE. Melanism in Peromyscus is caused by independent mutations in agouti. PLoS ONE. 2009;4:e6435. doi: 10.1371/journal.pone.0006435 19649329

21. Rosenblum EB, Römpler H, Schöneberg T, Hoekstra HE. Molecular and functional basis of phenotypic convergence in white lizards at White Sands. Proceedings of the National Academy of Sciences. 2010;107(5):2113–2117. Available from: http://www.pnas.org/content/107/5/2113.abstract. doi: 10.1073/pnas.0911042107

22. Levin SA, Muller-Landau HC, Nathan R, Chave J. The Ecology and Evolution of Seed Dispersal: a theoretical perspective. Annu Rev Ecol Evol Syst. 2003;34 : 575–604. doi: 10.1146/annurev.ecolsys.34.011802.132428

23. Hallatschek O, Fisher DS. Acceleration of evolutionary spread by long-range dispersal. Proc Natl Acad Sci U S A. 2014 Nov;111(46):4911–4919. Available from: http://www.ncbi.nlm.nih.gov/pubmed/25368183. doi: 10.1073/pnas.1404663111

24. Haldane JBS. The theory of a cline. J Genet. 1948 Jan;48(3):277–284. doi: 10.1007/BF02986626 18905075

25. Fisher RA. Gene Frequencies in a Cline Determined by Selection and Diffusion. Biometrics. 1950;6(4):pp. 353–361. doi: 10.2307/3001780 14791572

26. Nagylaki T. Conditions for the existence of clines. Genetics. 1975;80(3):595–615. Available from: http://www.genetics.org/content/80/3/595.abstract.

27. Conley C. An application of Wazewski’s method to a non-linear boundary value problem which arises in population genetics. Journal of Mathematical Biology. 1975;2(3):241–249. doi: 10.1007/BF00277153

28. Lenormand T. Gene flow and the limits to natural selection. Trends in Ecology & Evolution. 2002;17(4):183—189. doi: 10.1016/S0169-5347(02)02497-7

29. Barton NH. The probability of establishment of an advantageous mutant in a subdivided population. Genetics Research. 1987;50(1):35–40. doi: 10.1017/S0016672300023314

30. Pollak E. On the Survival of a Gene in a Subdivided Population. Journal of Applied Probability. 1966;3(1):142–155. doi: 10.2307/3212043

31. Haldane JBS. A Mathematical Theory of Natural and Artificial Selection, Part V: Selection and Mutation. Mathematical Proceedings of the Cambridge Philosophical Society. 1927 7;23(07):838–844. doi: 10.1017/S0305004100015644

32. Fisher RA. The genetical theory of natural selection. Oxford: Oxford University Press; 1930. Available from: http://www.archive.org/details/geneticaltheoryo031631mbp.

33. Maruyama T. On the fixation probability of mutant genes in a subdivided population. Genetics Research. 1970;15(02):221–225. doi: 10.1017/S0016672300001543

34. Cherry JL, Wakeley J. A diffusion approximation for selection and drift in a subdivided population. Genetics. 2003 Jan;163(1):421–428. 12586727

35. Cantrell RS, Cosner C. Diffusive logistic equations with indefinite weights: population models in disrupted environments. Proceedings of the Royal Society of Edinburgh, Section: A Mathematics. 1989 0;112(3–4):293–318. doi: 10.1017/S030821050001876X

36. Lou Y, Yanagida E. Minimization of the principal eigenvalue for an elliptic boundary value problem with indefinite weight, and applications to population dynamics. Japan J Indust Appl Math. 2006;23(3):275–292. Available from: http://projecteuclid.org/euclid.jjiam/1197390801. doi: 10.1007/BF03167595

37. NEQwiki. NEQwiki, the nonlinear equations encyclopedia; 2013. Accessed December 17, 2014. Available from: http://www.primat.mephi.ru/wiki/ow.asp?Korteweg-de_Vries_equation.

38. Jagers P. Branching processes with biological applications. Wiley Series in Probability and Statistics: Applied Probability and Statistics Section Series. Wiley; 1975.

39. Geiger J. Elementary New Proofs of Classical Limit Theorems for Galton-Watson Processes. Journal of Applied Probability. 1999;36(2):pp. 301–309. Available from: http://www.jstor.org/stable/3215457. doi: 10.1239/jap/1032374454

40. Aldous DJ. Exchangeability and related topics. In: École d’été de probabilités de Saint-Flour, XIII—1983. vol. 1117 of Lecture Notes in Math. Berlin: Springer; 1985. p. 1–198. Available from: http://www.springerlink.com/content/c31v17440871210x/fulltext.pdf.

41. Donnelly P, Joyce P. Continuity and weak convergence of ranked and size-biased permutations on the infinite simplex. Stochastic Process Appl. 1989;31(1):89–103. doi: 10.1016/0304-4149(89)90104-X

42. Maynard Smith J, Haigh J. The hitch-hiking effect of a favourable gene. Genet Res. 1974 Feb;23(1):23–35. Available from: http://www.ncbi.nlm.nih.gov/pubmed/4407212. doi: 10.1017/S0016672300014634

43. Barton NH, Etheridge AM, Kelleher J, Véber A. Genetic hitchhiking in spatially extended populations. Theoretical Population Biology. 2013;(0):–. Available from: http://www.sciencedirect.com/science/article/pii/S0040580912001359.

44. Barton N. Gene flow past a cline. Heredity. 1979 Dec;43(3):333–339. doi: 10.1038/hdy.1979.86

45. Borodin AN, Salminen P. Handbook of Brownian motion: facts and formulae. Springer; 2002.

46. Nachman MW, Hoekstra HE, D’Agostino SL. The genetic basis of adaptive melanism in pocket mice. Proc Natl Acad Sci U S A. 2003 Apr;100(9):5268–5273. doi: 10.1073/pnas.0431157100 12704245

47. Hoekstra HE, Krenz JG, Nachman MW. Local adaptation in the rock pocket mouse (Chaetodipus intermedius): natural selection and phylogenetic history of populations. Heredity. 2005 Nov;94(2):217–228. doi: 10.1038/sj.hdy.6800600 15523507

48. Hoekstra HE, Drumm KE, Nachman MW. Ecological genetics of adaptive color polymorphism in pocket mice: geographic variation in selected and neutral genes. Evolution. 2004 Jun;58(6):1329–1341. Available from: http://onlinelibrary.wiley.com/doi/10.1111/j.0014-3820.2004.tb01711.x/abstract. doi: 10.1554/03-418 15266981

49. French NR, Tagami TY, Hayden P. Dispersal in a Population of Desert Rodents. Journal of Mammalogy. 1968;49(2):pp. 272–280. doi: 10.2307/1377984

50. Allred DM, Beck DE. Range of Movement and Dispersal of Some Rodents at the Nevada Atomic Test Site. Journal of Mammalogy. 1963;44(2):pp. 190–200. doi: 10.2307/1377452

51. Hoekstra HE. Genetics, development and evolution of adaptive pigmentation in vertebrates. Heredity (Edinb). 2006 Sep;97(3):222–234. Available from: http://www.ncbi.nlm.nih.gov/pubmed/16823403. doi: 10.1038/sj.hdy.6800861

52. Messer PW, Petrov DA. Population genomics of rapid adaptation by soft selective sweeps. Trends Ecol Evol (Amst). 2013 Nov;28(11):659–669. doi: 10.1016/j.tree.2013.08.003

53. Wilson BA, Petrov DA, Messer PW. Soft selective sweeps in complex demographic scenarios. Genetics. 2014 Oct;198(2):669–684. doi: 10.1534/genetics.114.165571 25060100

54. Ralph PL, Coop G. The role of standing variation in geographic convergent adaptation. The American Naturalist. 2015;.

55. Slatkin M, Wiehe T. Genetic hitch-hiking in a subdivided population. Genet Res. 1998 Apr;71(2):155–160. doi: 10.1017/S001667239800319X 9717437

56. Kim Y, Maruki T. Hitchhiking effect of a beneficial mutation spreading in a subdivided population. Genetics. 2011 Sep;189(1):213–226. Available from: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3176130/. doi: 10.1534/genetics.111.130203 21705748

57. Barton NH. Genetic hitchhiking. Philos Trans R Soc Lond B Biol Sci. 2000 Nov;355(1403):1553–1562. Available from: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1692896/. doi: 10.1098/rstb.2000.0716 11127900

58. Reynolds AM, Rhodes CJ. The Lévy flight paradigm: random search patterns and mechanisms. Ecology. 2009 Apr;90(4):877–887. doi: 10.1890/08-0153.1 19449680

59. Censky EJ, Hodge K, Dudley J. Over-water dispersal of lizards due to hurricanes. Nature. 1998 Oct;395(6702):556–556. doi: 10.1038/26886

60. Nathan R, Schurr FM, Spiegel O, Steinitz O, Trakhtenbrot A, Tsoar A. Mechanisms of long-distance seed dispersal. Trends in Ecology & Evolution. 2008;23(11):638—647. Available from: http://www.sciencedirect.com/science/article/pii/S0169534708002723. doi: 10.1016/j.tree.2008.08.003

61. Cantrell RS, Cosner C. Diffusive logistic equations with indefinite weights: population models in disrupted environments. II. SIAM J Math Anal. 1991;22(4):1043–1064. doi: 10.1137/0522068

62. Sexton JP, Hangartner SB, Hoffmann AA. Genetic isolation by environment or distance: Which pattern of gene flow is most common? Evolution. 2013;Available from: http://dx.doi.org/10.1111/evo.12258. 24111567

63. Kondrashov AS. Accumulation of Dobzhansky–Muller incompatibilities within a spatially structured population. Evolution. 2003 Jan;57(1):151–153. doi: 10.1554/0014-3820(2003)057%5B0151:AODMIW%5D2.0.CO;2 12643575

64. Kolmogorov A, Petrovskii I, Piscunov N. A study of the equation of diffusion with increase in the quantity of matter, and its application to a biological problem. In: Selected Works of A.N. Kolmogorov: Mathematics and mechanics. vol. 25 of Mathematics and its Applications (Soviet Series). Dordrecht: Kluwer Academic Publishers Group; 1991. p. 1–25.

65. Fisher R. The wave of advance of advantageous genes. Ann Eugenics. 1937;7 : 353–369. Available from: http://digital.library.adelaide.edu.au/coll/special/fisher/152.pdf. doi: 10.1111/j.1469-1809.1937.tb02153.x

66. Etheridge AM. An introduction to superprocesses. vol. 20 of University Lecture Series. Providence, RI: American Mathematical Society; 2000.

67. Dawson DA. Measure-valued Markov processes. In: École d’Été de Probabilités de Saint-Flour XXI—1991. vol. 1541 of Lecture Notes in Math. Berlin: Springer; 1993. p. 1–260.

68. Gradshteyn IS, Ryzhik IM. Table of integrals, series, and products. Seventh ed. Elsevier/Academic Press, Amsterdam; 2007. Translated from the Russian, Translation edited and with a preface by Alan Jeffrey and Daniel Zwillinger.