Volume 3, Issue 3
APPLICATION

abc: an R package for approximate Bayesian computation (ABC)

Katalin Csilléry

Corresponding Author

Irstea, UR EMGR, 2 rue de la Papeterie, F‐38402 Saint Martin d'Hères, France

Correspondence author. E‐mail: kati.csillery@gmail.com
Olivier François

Computational and Mathematical Biology Team, Laboratoire Techniques de l'Ingénierie Médicale et de la Complexité, Université Joseph Fourier, Grenoble 1, Centre National de la Recherche Scientifique UMR5525, F‐38706 La Tronche, France

Michael G. B. Blum

Computational and Mathematical Biology Team, Laboratoire Techniques de l'Ingénierie Médicale et de la Complexité, Université Joseph Fourier, Grenoble 1, Centre National de la Recherche Scientifique UMR5525, F‐38706 La Tronche, France

First published: 31 January 2012

Summary

1. Many recent statistical applications involve inference under complex models, where it is computationally prohibitive to calculate likelihoods but possible to simulate data. Approximate Bayesian computation (ABC) is devoted to these complex models because it bypasses the evaluation of the likelihood function by comparing observed and simulated data.

2. We introduce the R package ‘abc’ that implements several ABC algorithms for performing parameter estimation and model selection. In particular, the recently developed nonlinear heteroscedastic regression methods for ABC are implemented. The ‘abc’ package also includes a cross‐validation tool for measuring the accuracy of ABC estimates and to calculate the misclassification probabilities when performing model selection. The main functions are accompanied by appropriate summary and plotting tools.

3. R is already widely used in bioinformatics and several fields of biology. The R package ‘abc’ will make the ABC algorithms available to a large number of R users. ‘abc’ is a freely available R package under the GPL license, and it can be downloaded at http://cran.r-project.org/web/packages/abc/index.html.

Introduction

In recent years, approximate Bayesian computation (ABC) has become a popular method for parameter inference and model selection under complex models, where the evaluation of the likelihood function is computationally prohibitive. ABC bypasses exact likelihood calculations via the use of summary statistics and simulations, which, in turn, allows the consideration of highly complex models. The name ABC was first coined by Beaumont et al. (2002) in population genetics, for inference under coalescent models, but its origin goes back to works by Tavaré et al. (1997) and Pritchard et al. (1999). ABC is now increasingly applied, especially in ecology and systems biology (for reviews of ABC methods and applications, see Beaumont 2010; Bertorelle et al. 2010; Csilléry et al. 2010). Software implementations of ABC dedicated to particular problems have already been developed in these fields (Anderson et al. 2005; Hickerson et al. 2007; Cornuet et al. 2008; Jobin & Mountain 2008; Tallmon et al. 2008; Lopes et al. 2009; Thornton 2009; Bray et al. 2010; Cornuet et al. 2010; Liepe et al. 2010; Wegmann et al. 2010; Huang et al. 2011).

The integration of ABC in a software package poses several challenges. First, data simulation, which is at the core of any ABC analysis, is specific to the model in question. Thus, many existing ABC programs are specific to a particular class of models (Hickerson et al. 2007; Cornuet et al. 2008; Lopes et al. 2009) or even to the estimation of a particular parameter (Tallmon et al. 2008). Further, model comparison is an integral part of any Bayesian analysis; thus, it is essential to provide software in which users are able to fit different models to their data. Second, an ABC analysis often follows a trial‐and‐error approach, where users experiment with different models, ABC algorithms or summary statistics. Therefore, it is important that users can run different analyses using batch files, which contain each analysis as a sequence of commands. Third, ABC is subject to intensive research, and many new algorithms have been published in the past few years (Beaumont et al. 2002, 2009; Bortot et al. 2007; Sisson et al. 2007; Blum 2010). Thus, ABC software should be flexible enough to accommodate new developments in the field.

Here, we introduce a generalist R package ‘abc’, which aims to address the above challenges (R Development Core Team 2011). The price to pay for the generality and flexibility is that the simulation of data and the calculation of summary statistics are left to the users. However, simulation software might be called from an R session, which opens up the possibility for a highly interactive ABC analysis. For coalescent models, for instance, users can apply one of the many existing programs for simulating genetic data such as ‘ms’ (Hudson 2002) or ‘fastsimcoal’ (Excoffier & Foll 2011). The calculation of summary statistics could be performed using either R or some specific software such as ‘msABC’ (Pavlidis et al. 2010), which runs ‘ms’ and calculates summary statistics from the output files. ABC methods have also been developed to handle full data (Sousa et al. 2009) – allele frequencies in population genetics – but the ‘abc’ package is dedicated to summary statistic approaches, which represent the bulk of the literature.

R provides many advantages in the context of ABC: (i) R already possesses the necessary tools to handle, analyse and visualise large data sets, (ii) sequences of R commands can be saved in a script file and (iii) R is a free and collaborative project; thus, new algorithms can be easily integrated into the package (e.g. via contributions from their authors).

Implementation

The main steps of an ABC analysis follow the general scheme of any Bayesian analysis: formulating a model, fitting the model to data (parameter estimation) and improving the model by checking its fit (posterior predictive checks) and comparing it to other models (Gelman et al. 2003; Csilléry et al. 2010). ‘abc’ provides functions for the inference and model comparison steps, and generic tools of R can be used for model checking.

To use the package, the following R objects should be prepared: a vector of the observed summary statistics, a matrix of the simulated summary statistics, where each row corresponds to a simulation and each column corresponds to a summary statistic, and finally, a matrix of the simulated parameter values, where each row corresponds to a simulation and each column corresponds to a parameter.
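As an illustration of the three objects described above, the following sketch builds them for a toy model. The model (50 draws from a normal distribution with unknown mean), the prior, and all object and function names are assumptions chosen for illustration; they are not part of the package.

```r
# Toy model (an assumption for illustration): data are 50 draws from
# Normal(mu, 1), with a Uniform(-5, 5) prior on mu.
set.seed(1)
n.sim <- 1000

# Hypothetical helper: simulate a data set and reduce it to two summaries.
simulate.sumstat <- function(mu) {
  y <- rnorm(50, mean = mu, sd = 1)
  c(mean = mean(y), sd = sd(y))
}

# Matrix of simulated parameter values: one row per simulation,
# one column per parameter.
par.sim <- matrix(runif(n.sim, -5, 5), ncol = 1,
                  dimnames = list(NULL, "mu"))

# Matrix of simulated summary statistics: one row per simulation,
# one column per summary statistic.
stat.sim <- t(sapply(par.sim[, "mu"], simulate.sumstat))

# Vector of observed summary statistics (here from pseudo-observed data).
stat.obs <- simulate.sumstat(2)

dim(par.sim)
dim(stat.sim)
```

Row i of the parameter matrix and row i of the summary statistic matrix must refer to the same simulation, since the two are matched by position.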

Parameter inference

For the sake of clarity, we recall the general scheme of parameter estimation with ABC. Suppose that we want to compute the posterior probability distribution of a univariate or multivariate parameter, θ. A parameter value θi is sampled from its prior distribution to simulate a data set yi, for i = 1,…,n where n is the number of simulations. A set of summary statistics S(yi) is computed from the simulated data and compared to the summary statistics obtained from the actual data S(y0) using a distance measure d. We consider the Euclidean distance for d, and the ‘abc’ package standardises each summary statistic with a robust estimate of the standard deviation (the median absolute deviation). If d(S(yi),S(y0)) (i.e. the distance between S(yi) and S(y0)) is less than a given threshold, the parameter value θi is accepted. To set a threshold for d, above which simulations are rejected, the user has to provide the tolerance rate, which is defined as the proportion of accepted simulations. The accepted θi’s form a sample from an approximation of the posterior distribution. The estimation of the posterior distribution can be improved by the use of regression techniques, which we detail in the following paragraph.
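The rejection scheme just described can be sketched in a few lines of base R. This is a minimal illustration under an assumed toy model (normal data with unknown mean) and hypothetical observed summaries; it is not the code used inside the package.

```r
set.seed(42)
n.sim <- 5000
tol   <- 0.01                       # tolerance rate: proportion accepted

# Toy model (assumed): 50 draws from Normal(mu, 1), mu ~ Uniform(-5, 5).
sumstat  <- function(mu) { y <- rnorm(50, mu, 1); c(mean(y), sd(y)) }
theta    <- runif(n.sim, -5, 5)
stat.sim <- t(sapply(theta, sumstat))
stat.obs <- c(2.0, 1.0)             # hypothetical observed summaries

# Standardise each summary statistic by its median absolute deviation.
scale.mad <- apply(stat.sim, 2, mad)
z.sim <- sweep(stat.sim, 2, scale.mad, "/")
z.obs <- stat.obs / scale.mad

# Euclidean distance between each simulation and the observation.
d <- sqrt(rowSums(sweep(z.sim, 2, z.obs, "-")^2))

# The tolerance rate fixes the distance threshold as a quantile of d.
threshold <- quantile(d, probs = tol)
accepted  <- d <= threshold
posterior <- theta[accepted]

sum(accepted)
summary(posterior)
```

Setting the threshold through a quantile means the user never has to choose a distance on the scale of the summary statistics; only the proportion of accepted simulations is specified.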

The function “abc” implements three ABC algorithms for constructing the posterior distribution from the accepted θi’s: a rejection method and two regression‐based correction methods that use either local linear regression (Beaumont et al. 2002) or neural networks (Blum & François 2010). When the rejection method (“rejection”) is selected, the accepted θi’s are considered as a sample from the posterior distribution (Pritchard et al. 1999). The two regression methods (“loclinear” and “neuralnet”) implement an additional step to correct for the imperfect match between the accepted, S(yi), and observed summary statistics, S(y0), using the following regression equation in the vicinity of S(y0)
$$\theta_i = m(S(y_i)) + \epsilon_i \qquad \text{(eqn 1)}$$
where m is the regression function and the εi’s are centred random variables with equal variance. Simulations that closely match S(y0) are given more weight by assigning to each simulation (θi,S(yi)) the weight K[d(S(yi),S(y0))], and the package implements different statistical kernels K. The local linear model (“loclinear”) assumes a linear function for m, while neural networks account for the non‐linearity of m and allow users to reduce the dimension of the set of summary statistics. Once the regression is performed, a weighted sample from the posterior distribution is obtained by correcting the θi’s as follows:
$$\theta_i^{*} = \hat{m}(S(y_0)) + \hat{\epsilon}_i \qquad \text{(eqn 2)}$$
where $\hat{m}(\cdot)$ is the estimated conditional mean and the $\hat{\epsilon}_i$’s are the empirical residuals of the regression (Beaumont et al. 2002). Additionally, a correction for heteroscedasticity is applied, by default, in “abc”,
$$\theta_i^{*} = \hat{m}(S(y_0)) + \frac{\hat{\sigma}(S(y_0))}{\hat{\sigma}(S(y_i))}\,\hat{\epsilon}_i \qquad \text{(eqn 3)}$$
where $\hat{\sigma}(\cdot)$ is the estimated conditional standard deviation (Blum & François 2010).
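The local linear correction of eqns 1 and 2 can be sketched with a weighted call to lm. This is a simplified illustration, not the package's implementation: it assumes a toy model with a single summary statistic, uses an Epanechnikov kernel for K, and leaves out the heteroscedastic correction of eqn 3.

```r
set.seed(7)
n.sim <- 5000
tol   <- 0.05

# Toy model (assumed): summary statistic = mean of 50 draws from Normal(mu, 1).
theta    <- runif(n.sim, -5, 5)
stat.sim <- rnorm(n.sim, mean = theta, sd = 1 / sqrt(50))
stat.obs <- 2.0                       # hypothetical observed summary

# Rejection step: keep the closest tol * n.sim simulations.
d    <- abs(stat.sim - stat.obs) / mad(stat.sim)
h    <- quantile(d, probs = tol)
keep <- d <= h

# Epanechnikov kernel weights K[d]: larger for closer simulations.
w <- 1 - (d[keep] / h)^2

# Weighted local linear fit of theta on S(y), i.e. a linear m in eqn 1.
dat <- data.frame(theta = theta[keep], s = stat.sim[keep])
fit <- lm(theta ~ s, data = dat, weights = w)

# Eqn 2: corrected value = fitted mean at S(y0) + empirical residual.
m.hat.obs  <- predict(fit, newdata = data.frame(s = stat.obs))
theta.star <- as.numeric(m.hat.obs) + residuals(fit)

c(sd.rejection = sd(theta[keep]), sd.adjusted = sd(theta.star))
```

The corrected sample theta.star keeps the weights w, so posterior summaries should be computed as weighted quantities. In this toy setting the adjusted sample is less dispersed than the plain rejection sample, because the regression removes the spread caused by accepting summaries that are near, but not equal to, the observed value.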

The function “abc” returns an object of class “abc” that can be printed, summarised and plotted using the S3 methods of the R generic functions, “print”, “summary”, “hist” and “plot”. The function “print” returns a description of the object. The function “summary” calculates summaries of the posterior distributions, such as the mode, mean, median and credible intervals, taking into account the posterior weights, when appropriate. The “hist” function displays the histogram of the weighted posterior sample. The “plot” function generates various plots that allow the evaluation of the quality of estimation when one of the regression methods is used. The following plots are generated: a density plot of the prior distribution, a density plot of the posterior distribution estimated with and without regression‐based correction, a scatter plot of the Euclidean distances as a function of the parameter values and a normal Q–Q plot of the residuals from the regression. When the heteroscedastic regression model is used, a normal Q–Q plot of the standardised residuals is displayed (see Fig. 1 panel a).

Fig. 1. Typical graphical outputs of the R ‘abc’ package (model selection and estimation of the effective population size Ne from population genetic data). (a) Parameter inference and regression diagnostics: plots show (clock‐wise) the prior distribution, the distances between observed and simulated summary statistics as a function of the parameter values (where red points indicate the accepted values), normal Q–Q plot of the residuals of the regression, and the posterior distribution obtained with and without the regression correction method (and the prior distribution, for reference). (b) Cross‐validation for parameter estimation: plot shows the estimated values as a function of true parameter values. Different colours correspond to different values of the tolerance rate. (c) Model misclassification: a graphical illustration of the confusion matrix for three models. The colours from dark to light grey correspond to models bott, const and exp, respectively. If the simulations were perfectly classified, each bar would have a single colour of its own corresponding model. The following R code can be used to re‐generate these plots.
> library(abc)
> data(human)
> cv.modsel <- cv4postpr(models, stat.3pops.sim, nval=50, tol=.01, method="mnlogistic")
> plot(cv.modsel)
> stat.italy.sim <- subset(stat.3pops.sim, subset=models=="bott")
> cv.res.reg <- cv4abc(data.frame(Na=par.italy.sim[,"Ne"]), stat.italy.sim,
+ nval=200, tols=c(.005,.001), method="loclinear")
> plot(cv.res.reg, caption="Ne")
> res <- abc(target=stat.voight["italian",], param=data.frame(Na=par.italy.sim[, "Ne"]),
+ sumstat=stat.italy.sim, tol=0.005, transf=c("log"), method="neuralnet")
> plot(res, param=par.italy.sim[, "Ne"])

Finally, we note that alternative algorithms exist that sample from an updated distribution that is closer in shape to the posterior than to the prior (Marjoram et al. 2003; Beaumont et al. 2009; Wegmann et al. 2010). However, we do not implement these methods in the ‘abc’ package because they require the repeated use of the simulation software.

Posterior predictive checks

We strongly recommend that users perform posterior predictive checks after fitting their model to the data. There is no specific function in the package ‘abc’ for posterior predictive checks; nevertheless, the task can be easily carried out using R and the simulation software. A fully executable example using R and ‘ms’ can be found in the package's vignette. Briefly, to perform model checking, one can obtain replicates from the posterior distribution of the parameters using the function abc. Then, one can simulate the summary statistics a posteriori using the simulation software. In ABC, posterior predictive checks might use the summary statistics twice: once for sampling from the posterior distribution and once for comparing the marginal posterior predictive distributions to the observed values of the summary statistics. To avoid this circularity, we might consider using different summary statistics for posterior predictive checks than for parameter estimation, for example using the expected deviance function.
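The logic of such a check can be sketched as follows. The posterior sample, the observed value and the check statistic (the sample median, standing in for a statistic not used during estimation) are all hypothetical; in a real analysis the posterior sample would come from the fitted object and the data would be generated by the user's simulation software.

```r
set.seed(3)
# Toy model (assumed): 50 draws from Normal(mu, 1). The vector below stands
# in for a posterior sample of mu, e.g. the accepted or adjusted values.
posterior <- rnorm(500, mean = 2, sd = 0.15)   # hypothetical posterior sample

# Hypothetical observed value of a statistic (the sample median) that was
# NOT used for parameter estimation.
obs.check <- 2.3

# Simulate one data set per posterior draw and compute the check statistic.
ppc <- sapply(posterior, function(mu) median(rnorm(50, mean = mu, sd = 1)))

# Two-sided posterior predictive p-value: how extreme is the observed value
# relative to the posterior predictive distribution?
p.upper <- mean(ppc >= obs.check)
p.value <- 2 * min(p.upper, 1 - p.upper)
p.value
```

A histogram of ppc with the observed value marked (for example with hist and abline) gives the graphical version of the same check.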

Cross‐validation

The function “cv4abc” performs a leave‐one‐out cross‐validation to evaluate the accuracy of parameter estimates and the robustness of the estimates to the tolerance rate. To perform cross‐validation, the ith simulation is randomly selected as a validation simulation, its summary statistic(s) S(yi) are used as pseudo‐observed summary statistics, and its parameters are estimated via “abc” using all simulations except the ith simulation. Ideally, the process is repeated n times, where n is the number of simulations (so‐called n‐fold cross‐validation). However, performing an n‐fold cross‐validation might take up too much time, so the cross‐validation is often performed for a subset of typically 100 randomly selected simulations. The “summary” S3 method of “cv4abc” computes the prediction error as
$$E_{\mathrm{pred}} = \frac{\sum_i (\tilde{\theta}_i - \theta_i)^2}{\mathrm{Var}(\theta_i)} \qquad \text{(eqn 4)}$$
where θi is the true parameter value of the ith simulated data set and $\tilde{\theta}_i$ is the estimated parameter value (the posterior median). The “plot” function displays the estimated parameter values as a function of the true values (see Fig. 1 panel b).
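The cross‐validation loop can be sketched in base R with the rejection method and the posterior median as point estimate. The toy model and all names are assumptions. The prediction error is computed here as a squared error scaled by the variance of the true values and averaged over the validation points; that averaging is a choice made for this sketch, and the exact normalisation used by the package may differ.

```r
set.seed(11)
n.sim <- 2000
nval  <- 50            # number of validation simulations
tol   <- 0.05

# Toy model (assumed): summary statistic = mean of 50 draws from Normal(mu, 1).
theta    <- runif(n.sim, -5, 5)
stat.sim <- rnorm(n.sim, mean = theta, sd = 1 / sqrt(50))

cv.idx <- sample(n.sim, nval)
estim  <- sapply(cv.idx, function(i) {
  # Use simulation i as pseudo-observed data and all others for inference.
  d    <- abs(stat.sim[-i] - stat.sim[i])
  keep <- d <= quantile(d, probs = tol)
  median(theta[-i][keep])            # posterior median from rejection ABC
})

# Scaled squared prediction error, averaged over the validation points.
true     <- theta[cv.idx]
pred.err <- sum((estim - true)^2) / (nval * var(true))
pred.err
```

Repeating the loop for several tolerance rates and comparing the resulting errors shows how sensitive the estimates are to that choice.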

Model selection

The function “postpr” implements model selection to approximate the posterior probability of a model M as Pr(M|S(y0)). Three different methods are implemented. With the rejection method (“rejection”), the approximate posterior probability of a given model is proportional to the proportion of accepted simulations under this model. The two other methods are based on multinomial logistic regression (“mnlogistic”) or neural networks (“neuralnet”). In these two approaches, the model indicator is treated as the response variable of a polychotomous regression, where the summary statistics are the independent variables (Beaumont 2008). Using neural networks can be efficient when high‐dimensional summary statistics are used. All of these methods are valid when the different models to be compared are, a priori, equally likely, and the same number of simulations is performed under each model. The “summary” S3 method for “postpr” displays the approximate posterior model probabilities, and calculates the ratios of model probabilities (the approximate Bayes factors) for all possible pairs of models (François et al. 2008).
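The rejection method for model selection can be sketched as follows. The two candidate models, the summary statistics and all names are assumptions made for illustration; the regression‐based methods are not shown.

```r
set.seed(5)
n.per.model <- 3000     # same number of simulations under each model
tol <- 0.02
labs <- c("norm", "t3")

# Two hypothetical models for 50 observations:
#   "norm": Normal(0, 1)      "t3": Student t with 3 degrees of freedom
# Summary statistics: sample sd and sample excess kurtosis.
sumstat <- function(y) {
  z <- (y - mean(y)) / sd(y)
  c(sd(y), mean(z^4) - 3)
}
models   <- rep(labs, each = n.per.model)
stat.sim <- rbind(
  t(replicate(n.per.model, sumstat(rnorm(50)))),
  t(replicate(n.per.model, sumstat(rt(50, df = 3))))
)
stat.obs <- sumstat(rt(50, df = 3))   # pseudo-observed data from "t3"

# Rejection step on MAD-standardised Euclidean distances.
s <- apply(stat.sim, 2, mad)
d <- sqrt(rowSums(sweep(sweep(stat.sim, 2, stat.obs, "-"), 2, s, "/")^2))
keep <- d <= quantile(d, probs = tol)

# Posterior model probabilities: share of accepted simulations per model.
post.prob <- table(factor(models[keep], levels = labs)) / sum(keep)
post.prob

# Approximate Bayes factor of "t3" against "norm" (equal prior weights).
post.prob["t3"] / post.prob["norm"]
```

Because the models are simulated equally often and given equal prior weight, the ratio of accepted counts can be read directly as an approximate Bayes factor. If no simulation from one model is accepted, the ratio is infinite or zero, which signals that the tolerance is too small for a stable estimate.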

A further function, “expected.deviance”, is implemented to guide the model selection procedure. The function computes an approximate expected deviance from the posterior predictive distribution. Thus, to use the function, users have to re‐use the simulation tool and to simulate data from the posterior parameter values. The method is particularly advantageous when it is used with one of the regression methods. Further details on the method can be found in François & Laval (2011), and fully worked out examples are provided in the package’s manual pages.

Computing misclassification errors

A cross‐validation tool is available for model selection as well via the function “cv4postpr”. The objective is to evaluate whether model selection with ABC is able to distinguish between the proposed models by making use of the existing simulations. The summary statistics from one of the simulations are considered as pseudo‐observed summary statistics and classified using all the remaining simulations. Then, if the summary statistics contain sufficient information to discriminate among models, one expects that a large posterior probability should be assigned to the model that generated the pseudo‐observed summary statistics. Two versions of the cross‐validation are implemented. The first version is a ‘hard’ model classification. We consider a given simulation as the pseudo‐observed data and assign it to the model for which “postpr” gives the highest posterior model probability. This procedure is repeated for a given number of simulations for each model. The results are summarised in a so‐called confusion matrix (Hastie et al. 2009). Each row of the confusion matrix corresponds to the model under which the simulations were generated, each column corresponds to the model to which “postpr” assigned them, and each entry counts the simulations in that combination. If all simulations had been correctly classified, only the diagonal elements of the matrix would be non‐zero. The second version is called ‘soft’ classification. Here, we do not assign a simulation to the model with the highest posterior probability but average the posterior probabilities over many simulations for a given model. This procedure is again summarised as a matrix, which is similar to the confusion matrix. However, the elements of the matrix do not give model counts, but the average posterior probabilities across simulations for a given model. The matrices can be visualised with a bar plot using the “plot” S3 method for “cv4postpr” (see Fig. 1c).
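Both versions of the cross‐validation can be sketched in base R with the rejection method. The two toy models, the summary statistics and all names are assumptions made for illustration; this is not the code behind “cv4postpr”.

```r
set.seed(9)
n.per.model <- 2000
nval <- 20                 # validation simulations per model
tol  <- 0.02
labs <- c("norm", "t3")

# Hypothetical models: Normal(0, 1) versus Student t with 3 df, 50 draws each.
sumstat <- function(y) {
  z <- (y - mean(y)) / sd(y)
  c(sd(y), mean(z^4) - 3)
}
models   <- rep(labs, each = n.per.model)
stat.sim <- rbind(
  t(replicate(n.per.model, sumstat(rnorm(50)))),
  t(replicate(n.per.model, sumstat(rt(50, df = 3))))
)
z.sim <- sweep(stat.sim, 2, apply(stat.sim, 2, mad), "/")

# Posterior model probabilities for pseudo-observed simulation i,
# using all remaining simulations (rejection method).
post.prob <- function(i) {
  d    <- sqrt(rowSums(sweep(z.sim[-i, ], 2, z.sim[i, ], "-")^2))
  keep <- d <= quantile(d, probs = tol)
  as.numeric(table(factor(models[-i][keep], levels = labs))) / sum(keep)
}

cv.idx <- unlist(lapply(labs, function(m) sample(which(models == m), nval)))
probs  <- t(sapply(cv.idx, post.prob))        # one row per validation run
colnames(probs) <- labs

# 'Hard' classification: assign each run to its most probable model.
assigned  <- factor(labs[max.col(probs, ties.method = "first")], levels = labs)
confusion <- table(true = factor(models[cv.idx], levels = labs), assigned)

# 'Soft' classification: mean posterior probabilities per true model.
soft <- apply(probs, 2, function(p) tapply(p, models[cv.idx], mean))

confusion
soft
```

Each row of confusion sums to nval and each row of soft sums to one, so the off‐diagonal entries can be read directly as misclassification counts and average misassigned probability, respectively.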

Conclusions

We provide an R package ‘abc’ to perform model selection and parameter estimation via ABC. Integrating ‘abc’ within the R statistical environment offers high‐quality graphics and data visualisation tools. The R package implements recently developed non‐linear methods for ABC and is going to evolve as new algorithms and methods accumulate. We further direct our users to the package's vignette that contains a detailed worked‐through example of an ABC analysis for inferring ancestral human population size based on DNA sequence data.

Acknowledgements

We thank Mark Beaumont for kindly providing an R script that we used in the implementation of the functions abc and postpr. While working on this package, KC was funded by a post‐doctoral fellowship from the Université Joseph Fourier (ABC MSTIC) at the Computational and Mathematical Biology Team (BCM, TIMC‐IMAG) and then was hosted and financed by the Ecology and Evolution Laboratory (ENS, Paris, ANR‐06‐BDIV‐003).

      • The Current Genomic Landscape of Western South America: Andes, Amazonia, and Pacific Coast, Molecular Biology and Evolution, 10.1093/molbev/msz174, (2019).
      • Dispersal and local persistence shape the genetic structure of a widespread Neotropical plant species with a patchy distribution, Annals of Botany, 10.1093/aob/mcz105, (2019).
      • An ABC Method for Whole-Genome Sequence Data: Inferring Paleolithic and Neolithic Human Expansions, Molecular Biology and Evolution, 10.1093/molbev/msz038, (2019).
      • Signature of the Paleo-Course Changes in the São Francisco River as Source of Genetic Structure in Neotropical Pithecopus nordestinus (Phyllomedusinae, Anura) Treefrog, Frontiers in Genetics, 10.3389/fgene.2019.00728, 10, (2019).
      • Gradual Distance Dispersal Shapes the Genetic Structure in an Alpine Grasshopper, Genes, 10.3390/genes10080590, 10, 8, (590), (2019).
      • Speciation and subsequent secondary contact in two edaphic endemic primroses driven by Pleistocene climatic oscillation, Heredity, 10.1038/s41437-019-0245-8, (2019).
      • Quantitative evidence for early metastatic seeding in colorectal cancer, Nature Genetics, 10.1038/s41588-019-0423-x, (2019).
      • Contact zones and their consequences: hybridization between two ecologically isolated wild Petunia species, Botanical Journal of the Linnean Society, 10.1093/botlinnean/boz022, (2019).
      • Human Migration and the Spread of the Nematode Parasite Wuchereria bancrofti, Molecular Biology and Evolution, 10.1093/molbev/msz116, (2019).
      • Hydrological post-processing based on approximate Bayesian computation (ABC), Stochastic Environmental Research and Risk Assessment, 10.1007/s00477-019-01694-y, (2019).
      • Temporal genomic contrasts reveal rapid evolutionary responses in an alpine mammal during recent climate change, PLOS Genetics, 10.1371/journal.pgen.1008119, 15, 5, (e1008119), (2019).
      • Ancient admixture from an extinct ape lineage into bonobos, Nature Ecology & Evolution, 10.1038/s41559-019-0881-7, (2019).
      • Parallel Speciation of Wild Rice Associated with Habitat Shifts, Molecular Biology and Evolution, 10.1093/molbev/msz029, (2019).
      • Late Pleistocene climate change shapes population divergence of an Atlantic Forest passerine: a model-based phylogeographic hypothesis test, Journal of Ornithology, 10.1007/s10336-019-01650-1, (2019).
      • Population genetic structure and demography of Magnolia kobus: variety borealis is not supported genetically, Journal of Plant Research, 10.1007/s10265-019-01134-6, (2019).
      • Bayesian, Likelihood-Free Modelling of Phenotypic Plasticity and Variability in Individuals and Populations, Frontiers in Genetics, 10.3389/fgene.2019.00727, 10, (2019).
      • Genetic susceptibility to severe childhood asthma and rhinovirus-C maintained by balancing selection in humans for 150 000 years, Human Molecular Genetics, 10.1093/hmg/ddz304, (2019).
      • Demographic Histories and Genome-Wide Patterns of Divergence in Incipient Species of Shorebirds, Frontiers in Genetics, 10.3389/fgene.2019.00919, 10, (2019).
      • Phylogenetic Trees and Networks Can Serve as Powerful and Complementary Approaches for Analysis of Genomic Data, Systematic Biology, 10.1093/sysbio/syz056, (2019).
      • Disentangling the genetic effects of refugial isolation and range expansion in a trans-continentally distributed species, Heredity, 10.1038/s41437-018-0135-5, 122, 4, (441-457), (2018).
      • Out of Africa: demographic and colonization history of the Algerian mouse (Mus spretus Lataste), Heredity, 10.1038/s41437-018-0089-7, 122, 2, (150-171), (2018).
      • Modelling the impact of larviciding on the population dynamics and biting rates of Simulium damnosum (s.l.): implications for vector control as a complementary strategy for onchocerciasis elimination in Africa, Parasites & Vectors, 10.1186/s13071-018-2864-y, 11, 1, (2018).