Jared G. Galloway
Tree Sequence Applications (Accepted - Molecular Ecology Resources)
There is an increasing demand for evolutionary models to incorporate relatively realistic dynamics, ranging from selection at many genomic sites to complex demography, population structure, and ecological interactions. Such models can generally be implemented as individual-based forward simulations, but the large computational overhead of these models often makes simulation of whole chromosome sequences in large populations infeasible. This situation presents an important obstacle to the field that requires conceptual advances to overcome. The recently developed tree-sequence recording method (Kelleher et al., 2018), which stores the genealogical history of all genomes in the simulated population, could provide such an advance. This method has several benefits: (1) it allows neutral mutations to be omitted entirely from forward-time simulations and added later, thereby dramatically improving computational efficiency; (2) it allows neutral burn-in to be constructed extremely efficiently after the fact, using "recapitation"; (3) it allows direct examination and analysis of the genealogical trees along the genome; and (4) it provides a compact representation of a population's genealogy that can be analyzed in Python using the msprime package. We have implemented the tree-sequence recording method in SLiM 3 (a free, open-source evolutionary simulation software package) and extended it to allow the recording of non-neutral mutations, greatly broadening the utility of this method. To demonstrate the versatility and performance of this approach, we showcase several practical applications that would have been beyond the reach of previously existing methods, opening up new horizons for the modeling and exploration of evolutionary processes.
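Benefit (1) above can be illustrated with a minimal sketch, assuming a toy haploid Wright-Fisher population with no recombination. This is not SLiM's or msprime's implementation, and every function name here is hypothetical. The forward pass records only who each individual's parent was. Neutral mutations are placed afterwards, and only on the lineages ancestral to the final sample. For simplicity the overlay places at most one mutation per parent-child edge.

```python
import random

def simulate_parents(n_individuals, n_generations, rng):
    """Forward pass: record genealogy only, no mutations.
    parents[t][i] is the parent (in generation t) of individual i in generation t + 1."""
    return [[rng.randrange(n_individuals) for _ in range(n_individuals)]
            for _ in range(n_generations)]

def ancestral_edges(parents, sample):
    """Walk backward from the sample and collect the (generation, child) edges it descends through."""
    edges = set()
    current = set(sample)
    for t in range(len(parents) - 1, -1, -1):
        for child in current:
            edges.add((t, child))
        current = {parents[t][child] for child in current}
    return edges

def overlay_mutations(edges, mu, rng):
    """After the fact: mark each retained edge as mutated with probability mu (toy assumption)."""
    return {edge for edge in sorted(edges) if rng.random() < mu}

def genotype(parents, individual, mutated_edges):
    """List the mutated edges on one sampled individual's line of descent."""
    carried = []
    for t in range(len(parents) - 1, -1, -1):
        if (t, individual) in mutated_edges:
            carried.append((t, individual))
        individual = parents[t][individual]
    return carried

rng = random.Random(1)
parents = simulate_parents(n_individuals=200, n_generations=100, rng=rng)
sample = [0, 1, 2, 3, 4]
edges = ancestral_edges(parents, sample)
mutated = overlay_mutations(edges, mu=0.05, rng=rng)
genotypes = {i: genotype(parents, i, mutated) for i in sample}
print(len(edges), "of", 200 * 100, "edges are ancestral to the sample")
```

In this sketch the saving comes from never generating mutations on the large majority of edges, which leave no descendants in the sample.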
Threespine Stickleback Study (Almost there)
Threespine stickleback fish provide one of the most striking examples of local adaptation. This hemisphere-wide metapopulation includes both marine populations and a large number of smaller freshwater populations that have apparently adapted repeatedly to freshwater conditions, often using the same genetic basis. In this paper, we use simulations motivated by stickleback populations to examine what amounts of gene flow favor stable metapopulation polymorphism with allele sharing, and to further dissect the underlying dynamics. We find that rapid, repeated adaptation using alleles maintained at low frequency by migration-selection balance (the “transporter hypothesis”) occurs over a realistic range of intermediate rates of gene flow, between slow, independent adaptation at low gene flow and large migration load at high gene flow. This is mediated mainly by the total downstream influx of alleles. We do not see evidence for strong effects encouraging genomic clustering of causal alleles or favoring particular dominance coefficients. FST scans for adaptive alleles are more likely to succeed with higher rates of gene flow. The results support existing theory of local adaptation and provide a more concrete look at a particular, empirically motivated example.
Large-scale, forward-time evolutionary simulations are starting to play a key role in population genetics research. Simulations can help biologists understand the data observed in natural populations in fields such as molecular ecology, evolutionary genetics, and conservation biology. However, the scale of these simulations is limited by the capabilities of the hardware they run on, which motivates the search for methods that make large-scale simulations more feasible. Earlier this year, a strategy was introduced that allows simulations to avoid tracking and propagating neutral mutations by instead storing the entire genealogical history of sampled genotypes. In this thesis, we use succinct tree sequences, introduced by msprime, to explore and implement the method of genealogical tree sequence recording (TreeSeq) in forward-time evolutionary simulations. We then analyze the resulting runtime performance gain of one to two orders of magnitude from implementing this strategy in a popular evolutionary simulation framework, SLiM. Finally, we explain the workflow, algorithms, and data structures behind the implementation.
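The kind of data structure involved can be sketched as follows. This is a simplified illustration loosely modeled on the node and edge tables of succinct tree sequences, not the actual layout used by SLiM or msprime. The class `Tables` and the helpers `record_birth` and `parent_at` are hypothetical names, and a single crossover per birth is assumed. Each edge records that a child genome inherited the half-open interval [left, right) from a parent genome.

```python
from dataclasses import dataclass, field

@dataclass
class Tables:
    node_time: list = field(default_factory=list)  # birth time of each node (genome)
    edges: list = field(default_factory=list)      # (left, right, parent, child) tuples

    def add_node(self, time):
        """Register a new genome and return its node id."""
        self.node_time.append(time)
        return len(self.node_time) - 1

    def add_edge(self, left, right, parent, child):
        """Record inheritance of [left, right); empty intervals are skipped."""
        if left < right:
            self.edges.append((left, right, parent, child))

def record_birth(tables, time, parent_a, parent_b, breakpoint, length):
    """Record a child genome built from parent_a left of the breakpoint and parent_b right of it."""
    child = tables.add_node(time)
    tables.add_edge(0, breakpoint, parent_a, child)
    tables.add_edge(breakpoint, length, parent_b, child)
    return child

def parent_at(tables, position):
    """Return the child -> parent map of the local tree covering one genomic position."""
    return {c: p for (left, right, p, c) in tables.edges if left <= position < right}

tables = Tables()
a = tables.add_node(time=0)
b = tables.add_node(time=0)
child = record_birth(tables, time=1, parent_a=a, parent_b=b, breakpoint=40, length=100)
print(parent_at(tables, 10), parent_at(tables, 60))
```

Because neighboring positions usually share the same edges, storing intervals rather than one tree per site is what keeps the representation compact.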
firstname.lastname@example.org | (406) 579-6768 | GitHub