Volume 8, Issue 4
Application

deBInfer: Bayesian inference for dynamical models of biological systems in R

Philipp H. Boersch‐Supan

Corresponding Author

E-mail address: pboesu@gmail.com

Department of Integrative Biology, University of South Florida, Tampa, FL, 33610 USA

Emerging Pathogens Institute, University of Florida, Gainesville, FL, 32610 USA

Department of Geography, University of Florida, Gainesville, FL, 32611 USA

Correspondence author. E‐mail: pboesu@gmail.com
Sadie J. Ryan

Emerging Pathogens Institute, University of Florida, Gainesville, FL, 32610 USA

Department of Geography, University of Florida, Gainesville, FL, 32611 USA

Leah R. Johnson

Department of Integrative Biology, University of South Florida, Tampa, FL, 33610 USA

Department of Statistics, Virginia Polytechnic Institute and State University, Blacksburg, VA, 24061 USA

First published: 15 October 2016

Summary

  1. Understanding the mechanisms underlying biological systems, and ultimately, predicting their behaviours in a changing environment, requires overcoming the gap between mathematical models and experimental or observational data. Differential equations (DEs) are commonly used to model the temporal evolution of biological systems, but statistical methods for comparing DE models to data and for parameter inference are relatively poorly developed. This is especially problematic in the context of biological systems where observations are often noisy and only a small number of time points may be available.
  2. The Bayesian approach offers a coherent framework for parameter inference that can account for multiple sources of uncertainty, while making use of prior information. It offers a rigorous methodology for parameter inference, as well as modelling the link between unobservable model states and parameters, and observable quantities.
  3. We present deBInfer, a package for the statistical computing environment R, implementing a Bayesian framework for parameter inference in DEs. deBInfer provides templates for the DE model, the observation model and data likelihood, and the model parameters and their prior distributions. A Markov chain Monte Carlo (MCMC) procedure processes these inputs to estimate the posterior distributions of the parameters and any derived quantities, including the model trajectories. Further functionality is provided to facilitate MCMC diagnostics, the visualization of the posterior distributions of model parameters and trajectories, and the use of compiled DE models for improved computational performance.
  4. The templating approach makes deBInfer applicable to a wide range of DE models. We demonstrate its application to ordinary and delay DE models for population ecology.

Introduction

The use of differential equations (DEs) to model dynamical systems has a long and fruitful tradition in biological disciplines such as epidemiology, population ecology and physiology (Volterra 1926; Kermack & McKendrick 1927). As DE models are used in an attempt to understand biological systems, it is becoming clear that the simplest models cannot capture the rich variety of dynamics observed in them (Evans et al. 2013). However, more complex models come at the expense of additional states and/or parameters and require more information for parameterization. Further, as most observational data sets contain uncertainty, model identification and fitting become increasingly difficult (Lonergan 2014). Keeping complex models tractable and testable, and linking modelled quantities to data, thus requires statistical methods of similar sophistication. This is particularly relevant in biology, where data series are often short or noisy, and where the scope for observational or experimental replication may be limited.

A vast array of analytical and numerical methods exists for solving DE models as well as exploring their properties and the effect of parameter values on their dynamics (Jones 2003; Smith 2011). In some cases, parameters may be derived from first principles or measured directly, but often some or all parameters cannot be determined by either approach, and it is necessary to estimate them from an observational data set.

Parameter estimation methods for DE models, and their implementation as computational tools, are still less well developed than the aforementioned system dynamics tools and are a topic of active research.

Traditional parameter inference, also known as ‘model calibration’ or ‘solving inverse problems’, has, generally, been based on the maximum‐likelihood principle (Brewer et al. 2008; Aster, Borchers & Thurber 2011), which assumes the existence of a true model urn:x-wiley:2041210X:media:mee312679:mee312679-math-0001 giving rise to a true data set urn:x-wiley:2041210X:media:mee312679:mee312679-math-0002 such that
urn:x-wiley:2041210X:media:mee312679:mee312679-math-0003(eqn 1)
where θ is the parameter set for the model. The additional assumption that the observations urn:x-wiley:2041210X:media:mee312679:mee312679-math-0004 arise from a sum of urn:x-wiley:2041210X:media:mee312679:mee312679-math-0005 and measurement noise that is independently and normally distributed then leads to the least squares solution that is found by minimizing the Euclidian norm of the residual,
urn:x-wiley:2041210X:media:mee312679:mee312679-math-0006(eqn 2)

This approach has been applied to both ordinary differential equations (ODEs) (e.g. Baker et al. 2005) and simple delay‐differential equations (DDEs) (e.g. Horbelt, Timmer & Voss 2002). It allows for point estimates of the parameters, as well as the estimation of normal confidence intervals for the parameters and the correlations between them. However, these error bounds are local in nature and thus offer limited insight into the variability that is to be expected in the model outputs.
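
The least squares idea can be illustrated with a minimal sketch in base R. It assumes a logistic growth curve with a known closed-form solution, simulated observations with additive normal noise, and a general-purpose optimizer; the function names, parameter values and noise level are illustrative assumptions, not taken from the studies cited above.

```r
set.seed(42)

# Closed-form logistic solution, standing in for a numerically solved DE model
logistic_solution <- function(theta, times, N0 = 0.1) {
  r <- theta[1]
  K <- theta[2]
  K / (1 + (K / N0 - 1) * exp(-r * times))
}

# Simulated observations: true trajectory plus independent normal noise
times <- seq(0, 30, by = 3)
y_obs <- logistic_solution(c(0.5, 10), times) + rnorm(length(times), sd = 0.2)

# Residual sum of squares, i.e. the squared Euclidean norm of the residual
rss <- function(theta) sum((logistic_solution(theta, times) - y_obs)^2)

# Point estimate by numerical minimization (Nelder-Mead is the optim default)
fit <- optim(par = c(r = 0.3, K = 8), fn = rss)
fit$par
```

The result is a single point estimate; any uncertainty statement has to be derived separately, for example from the curvature of the objective around the optimum.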

Bayesian approaches for parameter estimation in complex, nonlinear models were established early on (e.g. Tarantola & Valette 1982; Poole & Raftery 2000), and they are being applied with increasing frequency to a broad range of biological models (e.g. Coelho, Codeço & Gomes 2011; Voyles et al. 2012; Johnson, Pecquerie & Nisbet 2013; Smith et al. 2015). Recent methodological advances have included the application of Hamiltonian Monte Carlo to ODE models, realized in the software package Stan (Carpenter et al. 2016), particle MCMC methods (Andrieu, Doucet & Holenstein 2010), approximate Bayesian computation (ABC; e.g. Liu & West 2001; Toni et al. 2009) and so‐called plug‐and‐play approaches (e.g. He, Ionides & King 2009). A suite of these methods are implemented in the R package pomp (King et al. 2016). While many statistical approaches, including the one presented here, treat the numerical solution of the DE model as exact, there has also been work towards quantifying the uncertainty contained in the numerical DE solutions themselves (Chkrebtii et al. 2015).

In the Bayesian approach, the model, its parameters and the data are viewed as random variables. This approach to parameter inference is attractive, as it provides a coherent framework that allows the incorporation of uncertainty in the observations and the process, and it relaxes the assumption of normal errors. It provides us not only with full probability distributions describing the parameters, but also with probability distributions for any quantity derived from them, including the model trajectories. Further, the Bayesian framework naturally lets us incorporate prior information about the parameter values. This is particularly useful when there are known biological or theoretical constraints on parameters. For example, many biological parameters, such as body size, cannot take on negative values. Using informative priors can help constrain the parameter space of the estimation procedure, aiding with parameter identifiability.

We explain the rationale behind the Bayesian approach below and describe our implementation of a fitting routine based on a Markov chain Monte Carlo (MCMC) sampler coupled to a numerical DE solver. We illustrate the application of deBInfer to a simple example, the logistic differential equation, and a more complex model of the reproductive life history of the fungal pathogen Batrachochytrium dendrobatidis.

Materials and methods

The purpose of deBInfer is to estimate the probability distribution of the parameters of a user‐specified DE model urn:x-wiley:2041210X:media:mee312679:mee312679-math-0007, given an empirical data set urn:x-wiley:2041210X:media:mee312679:mee312679-math-0008, and accounting for the uncertainty in the data. The model takes the general form
urn:x-wiley:2041210X:media:mee312679:mee312679-math-0009(eqn 3)
where x is a vector of variables evolving with time; f is a functional operator that takes a time input and a vector of continuous functions urn:x-wiley:2041210X:media:mee312679:mee312679-math-0010 and generates the vector urn:x-wiley:2041210X:media:mee312679:mee312679-math-0011 as output; and θ denotes a set of parameters. Further, we define urn:x-wiley:2041210X:media:mee312679:mee312679-math-0012. When all τ ∈ τ = 0, the model is represented by a system of ODEs; when any τ < 0, the model is represented by a system of delay‐differential equations (DDEs). For the purposes of inference, τ is simply a subset of the parameters θ that are to be estimated. deBInfer implements inference for ODEs as well as DDEs with constant delays.
Using Bayes's theorem (Clark 2007), we can calculate the posterior distribution of the model parameters, given the data and the prior information as
$$\Pr(\theta \mid \mathcal{Y}) = \frac{\Pr(\mathcal{Y} \mid \theta)\,\Pr(\theta)}{\Pr(\mathcal{Y})}$$ (eqn 4)
where Pr() denotes a probability, urn:x-wiley:2041210X:media:mee312679:mee312679-math-0014 denotes the data and θ denotes the set of model parameters. The product in the numerator is the joint distribution, which is made up of the likelihood urn:x-wiley:2041210X:media:mee312679:mee312679-math-0015 or urn:x-wiley:2041210X:media:mee312679:mee312679-math-0016, which gives the probability of observing urn:x-wiley:2041210X:media:mee312679:mee312679-math-0017 given the deterministic model urn:x-wiley:2041210X:media:mee312679:mee312679-math-0018, and the prior distribution Pr(θ), which represents the knowledge about θ before the data were collected. The denominator represents the marginal distribution of the data urn:x-wiley:2041210X:media:mee312679:mee312679-math-0019. Before the data are collected, urn:x-wiley:2041210X:media:mee312679:mee312679-math-0020 is a random variable, but after they are collected, the marginal distribution becomes a fixed quantity. This means, the inferential problem reduces to
$$\Pr(\theta \mid \mathcal{Y}) \propto \Pr(\mathcal{Y} \mid \theta)\,\Pr(\theta)$$ (eqn 5)
that is, finding a specific proportionality that allows the posterior $\Pr(\theta \mid \mathcal{Y})$ to be a proper probability density (or mass) function that integrates to 1.
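
For a single parameter, the proportionality and its normalization can be made concrete with a grid approximation. The following is a minimal sketch in base R; the logistic trajectory, the simulated data, the fixed noise level and the normal prior are all illustrative assumptions.

```r
set.seed(7)

# Closed-form logistic trajectory with K = 10 and N0 = 0.1 held fixed
times <- seq(0, 20, by = 2)
traj <- function(r) 10 / (1 + 99 * exp(-r * times))

# Simulated observations with log-normal noise around the true trajectory
obs <- rlnorm(length(times), meanlog = log(traj(0.4)), sdlog = 0.1)

# Evaluate likelihood times prior on a grid of candidate growth rates
r_grid <- seq(0.01, 1, length.out = 500)
log_lik <- sapply(r_grid, function(r) {
  sum(dlnorm(obs, meanlog = log(traj(r)), sdlog = 0.1, log = TRUE))
})
log_prior <- dnorm(r_grid, mean = 0, sd = 1, log = TRUE)

# Unnormalized posterior, shifted on the log scale for numerical stability
unnorm <- exp(log_lik + log_prior - max(log_lik + log_prior))

# Normalize so that the density integrates to 1 over the grid
dr <- r_grid[2] - r_grid[1]
posterior <- unnorm / sum(unnorm * dr)
r_mode <- r_grid[which.max(posterior)]
```

Grid evaluation scales poorly with the number of parameters, which is one reason sampling-based approximations are used for larger models.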

Closed form solutions for the posterior are practically impossible to obtain for complex nonlinear models with more than a few parameters, but they can be approximated, for example, by combining the MCMC algorithm with a Metropolis–Hastings sampler (Clark 2007). This yields a sequence of likelihoods that follow a frequency distribution which approximates the posterior distribution.

The likelihood urn:x-wiley:2041210X:media:mee312679:mee312679-math-0023 describes the probability of the data for a given realization of the model urn:x-wiley:2041210X:media:mee312679:mee312679-math-0024, and we can use the fact that the data are uncertain to derive an expression like
urn:x-wiley:2041210X:media:mee312679:mee312679-math-0025(eqn 6)
where urn:x-wiley:2041210X:media:mee312679:mee312679-math-0026 is a parametric probability distribution, typically with first and second moments μ and urn:x-wiley:2041210X:media:mee312679:mee312679-math-0027, urn:x-wiley:2041210X:media:mee312679:mee312679-math-0028 is data item t and urn:x-wiley:2041210X:media:mee312679:mee312679-math-0029 is the variance associated with urn:x-wiley:2041210X:media:mee312679:mee312679-math-0030.
Often the data urn:x-wiley:2041210X:media:mee312679:mee312679-math-0031 contain multiple data series, for example time‐course observations of different state variables, following different probability distributions. In this case, the likelihood becomes the product over all series and each data item in each series s
urn:x-wiley:2041210X:media:mee312679:mee312679-math-0032(eqn 7)
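
As a small numerical illustration, the product over series and data items becomes a sum on the log scale. The sketch below assumes two hypothetical series, one continuous with log-normal noise and one of counts with Poisson noise; the values and the series names are made up for illustration.

```r
# Hypothetical observations and model predictions for two data series
obs <- list(biomass = c(1.1, 2.3, 3.9), counts = c(4, 9, 15))
pred <- list(biomass = c(1.0, 2.2, 4.1), counts = c(5, 8.5, 16))

# Series 1: log-normal observation model with an assumed log-scale sd of 0.2
ll_biomass <- sum(dlnorm(obs$biomass, meanlog = log(pred$biomass),
                         sdlog = 0.2, log = TRUE))

# Series 2: Poisson observation model with the prediction as the mean
ll_counts <- sum(dpois(obs$counts, lambda = pred$counts, log = TRUE))

# Total log-likelihood: the log of the product over series and data items
log_lik_total <- ll_biomass + ll_counts

# The same quantity computed as an explicit product of densities
lik_product <- prod(dlnorm(obs$biomass, log(pred$biomass), 0.2)) *
  prod(dpois(obs$counts, pred$counts))
```

Working on the log scale avoids numerical underflow when the number of data items is large.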

Implementation

deBInfer provides a framework for dynamical models consisting of a deterministic DE model and a stochastic observation model. To perform inference using deBInfer, the user must specify R functions or data structures representing the DE model, an observation model and thus the data likelihood and declare all model and observation parameters, including prior distributions for those parameters that are to be estimated. The DE model itself can also be provided as a shared object, for example a compiled C function. deBInfer takes these inputs and performs MCMC to sample from the posterior distributions of parameters, solving the DE model numerically within the MCMC procedure. The MCMC procedure for deBInfer offers independent as well as random‐walk Metropolis–Hastings updates and is implemented fully in R (R Core Team 2015). Background on Metropolis–Hastings MCMC is widely available in the literature (e.g. Clark 2007; Brooks et al. 2011).

As numerically solving the DE model is the most computationally costly step, we made two slight modifications to the basic Metropolis–Hastings algorithms. (i) deBInfer makes a distinction between the parameters of the DE model urn:x-wiley:2041210X:media:mee312679:mee312679-math-0033, and the observation parameters urn:x-wiley:2041210X:media:mee312679:mee312679-math-0034, invoking the solver only for updates of the former, and (ii) the prior probability of each parameter proposal from the random‐walk sampler is evaluated before the posterior density and the acceptance ratio are calculated. This allows the rejection of proposals outside the prior support without invoking the numerical solver. The algorithm is outlined in Table 1.

Table 1. Implementation of the random‐walk Metropolis–Hastings algorithm. The transition from a parameter value urn:x-wiley:2041210X:media:mee312679:mee312679-math-0035 in the Markov chain at step k to its value at step k+1 proceeds via the outlined steps. q is a conditional density, the so‐called proposal distribution
  1. Generate a proposal urn:x-wiley:2041210X:media:mee312679:mee312679-math-0036
  2. Evaluate the prior probability urn:x-wiley:2041210X:media:mee312679:mee312679-math-0037
  3. if urn:x-wiley:2041210X:media:mee312679:mee312679-math-0038

    Let urn:x-wiley:2041210X:media:mee312679:mee312679-math-0039

  4. if urn:x-wiley:2041210X:media:mee312679:mee312679-math-0040

    if urn:x-wiley:2041210X:media:mee312679:mee312679-math-0041: solve the DE model

    Let urn:x-wiley:2041210X:media:mee312679:mee312679-math-0042

deBInfer provides a choice of three proposal distributions q for the first step in the algorithm, a normal urn:x-wiley:2041210X:media:mee312679:mee312679-math-0043, an asymmetric uniform urn:x-wiley:2041210X:media:mee312679:mee312679-math-0044 and a multivariate normal urn:x-wiley:2041210X:media:mee312679:mee312679-math-0045. deBInfer requires manual tuning; that is, the variance components urn:x-wiley:2041210X:media:mee312679:mee312679-math-0046, a and b, and Σ, respectively, are user‐specified inputs. The asymmetric uniform distribution is useful for proposals of parameters that are strictly positive, such as variances, and the multivariate normal is useful for efficiently sampling parameters that are strongly correlated, as is often the case for DE model parameters.
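
The prior check that precedes the solver call can be sketched for a single DE parameter as follows. This is not the package's implementation: it is a simplified base R illustration that assumes a normal random-walk proposal, a uniform prior, a fixed observation standard deviation, and the closed-form logistic solution as a stand-in for the numerical solver. All names and values are hypothetical.

```r
set.seed(1)

# Stand-in for the numerical DE solver (closed-form logistic solution)
solve_logistic <- function(r, K = 10, N0 = 0.1, times) {
  K / (1 + (K / N0 - 1) * exp(-r * times))
}

times <- seq(0, 20, by = 2)
obs <- rlnorm(length(times), meanlog = log(solve_logistic(0.4, times = times)),
              sdlog = 0.1)

# Assumed prior with bounded support, and log-normal data likelihood
log_prior <- function(r) dunif(r, min = 0, max = 2, log = TRUE)
log_lik <- function(traj) {
  sum(dlnorm(obs, meanlog = log(traj), sdlog = 0.1, log = TRUE))
}

n_iter <- 2000
chain <- numeric(n_iter)
chain[1] <- 1
solver_calls <- 0
cur_lp <- log_prior(chain[1]) + log_lik(solve_logistic(chain[1], times = times))

for (k in seq_len(n_iter - 1)) {
  prop <- rnorm(1, mean = chain[k], sd = 0.05)  # step 1: generate a proposal
  lp_prior <- log_prior(prop)                   # step 2: evaluate the prior
  if (!is.finite(lp_prior)) {                   # step 3: zero prior probability,
    chain[k + 1] <- chain[k]                    # reject without solving the DE
    next
  }
  solver_calls <- solver_calls + 1              # step 4: solve the DE model
  prop_lp <- lp_prior + log_lik(solve_logistic(prop, times = times))
  # Symmetric proposal, so the acceptance ratio is the posterior ratio
  if (log(runif(1)) < prop_lp - cur_lp) {
    chain[k + 1] <- prop
    cur_lp <- prop_lp
  } else {
    chain[k + 1] <- chain[k]
  }
}

post_mean_r <- mean(chain[-(1:500)])  # discard the first 500 draws as burn-in
```

With an asymmetric proposal the acceptance ratio would additionally need the ratio of proposal densities; that correction is omitted here because the normal kernel is symmetric.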

A simple example – logistic population growth

We illustrate the steps needed to perform inference for a DE model, by conducting inference on the logistic model (acknowledging that the existence of a closed form solution to this DE makes this an artificial example):
$$\frac{dN}{dt} = rN\left(1 - \frac{N}{K}\right)$$ (eqn 8)
Annotated code to implement this model, simulate observations from it and conduct the inference is provided as a package vignette (Appendix S1, Supporting Information). An overview of the core functions available in deBInfer is provided in Table 2.

Table 2. An overview of the main functions available in deBInfer

| Function | Description |
| --- | --- |
| debinfer_par | Creates a data structure representing an individual parameter or initial value of the DE model, or an observation parameter, and the corresponding values, priors, etc. |
| setup_debinfer | Combines multiple parameter declarations into an input object for inference |
| de_mcmc | Conducts MCMC inference on a DE model and returns an object of the class debinfer_result |
| plot.debinfer_result | Plots traces and posterior densities (wrapper for coda::plot.mcmc) |
| summary.debinfer_result | Summary statistics for MCMC samples (wrapper for coda::summary.mcmc) |
| pairs.debinfer_result | Pairwise plots and correlations of marginal posterior distributions |
| post_prior_densplot | Overlay of posterior and prior densities for free parameters |
| post_sim | Simulate posterior trajectories of the DE model and summary statistics thereof |
| plot.post_sim_list | Plot posterior DE model trajectories |

Installation

The deBInfer package is available on CRAN. The development version can be installed from GitHub using devtools (Wickham & Chang 2016), which can itself be installed from CRAN:

    # Install the CRAN release.
    install.packages("deBInfer")
    # Alternatively install devtools and the development version of deBInfer.
    install.packages("devtools")
    devtools::install_github("pboesu/debinfer")
    # Load deBInfer.
    library(deBInfer)

Specification of the differential equation model

deBInfer makes use of the deSolve and PBSddesolve packages (Soetaert, Petzoldt & Setzer 2010; Couture‐Beil et al. 2014) to numerically solve ODE and DDE models. The DE model has to be specified as a function containing the model equations, following the guidelines given in the respective package documentations. For our simple example, the function takes three inputs: time, a vector of time points at which to evaluate the DE; y, a vector containing the initial value for the state variable N; and parms, a vector containing the parameters r and K.

    logistic_model <- function(time, y, parms) {
      with(as.list(c(y, parms)), {
        dN <- r * N * (1 - N / K)
        list(dN)
      })
    }
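
A quick way to check such a function before passing it to a solver is to evaluate the derivative at states where the answer is known. The sketch below repeats the model definition so that it is self-contained; the parameter values are illustrative assumptions.

```r
logistic_model <- function(time, y, parms) {
  with(as.list(c(y, parms)), {
    dN <- r * N * (1 - N / K)
    list(dN)
  })
}

parms <- c(r = 0.1, K = 10)

# At N = K/2 the logistic growth rate is at its maximum, r * K / 4
dN_half <- logistic_model(0, c(N = 5), parms)[[1]]

# At N = K the population is at equilibrium and the derivative is zero
dN_at_K <- logistic_model(0, c(N = 10), parms)[[1]]

# With deSolve installed, the same function can be integrated numerically:
# deSolve::ode(y = c(N = 0.1), times = 0:100, func = logistic_model, parms = parms)
```

The state and parameter vectors must be named, because the function looks up N, r and K by name.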

Observation model and likelihood specification

For the purpose of demonstration, we will conduct inference on simulated observations from this model assuming log‐normal noise with a standard deviation urn:x-wiley:2041210X:media:mee312679:mee312679-math-0048. A set of simulated observations is provided with the package and can be loaded with the command data(logistic). The appropriate log‐likelihood takes the form
$$\ell(\tilde{\mathcal{Y}} \mid \theta) = \sum_t \ln\left(\frac{1}{\tilde{\mathcal{Y}}_t\,\sigma_{obs}\sqrt{2\pi}} \exp\left(-\frac{\left(\ln \tilde{\mathcal{Y}}_t - \ln(\mathcal{Y}_t + \varepsilon)\right)^2}{2\sigma_{obs}^2}\right)\right)$$ (eqn 9)
where $\tilde{\mathcal{Y}}_t$ are the observations, $\mathcal{Y}_t$ are the predictions of the DE model given the current MCMC sample of the parameters θ, and $\sigma_{obs}$ is the standard deviation of the log‐normal noise. Further, ɛ ≪ 1 is a small correction needed, because the exact DE solution can equal zero (or less, depending on numerical precision of the solver). ɛ should therefore be at least as large as the expected numerical precision of the solver. We chose ɛ = 10^{-6}, which is on the same order as the default numerical precision of the default solver (deSolve::ode with method = "lsoda"), but we found that the inference results were insensitive to this choice as long as ɛ ≤ 0·01 (Appendix S1, Conclusion).

The deBInfer observation model template requires three inputs: a data.frame of observations, data; the simulated trajectory returned by the numerical solver in the MCMC procedure, sim.data; and the current sample of the parameters, samp. The user specifies the observation model such that it returns the summed log‐likelihoods of the data. In this example, the observations are in the data.frame column N_noisy, and the corresponding predicted states are in the column N of the matrix-like object sim.data (see Appendix S1).

    # load example data
    data(logistic)
    # user defined data likelihood
    logistic_obs_model <- function(data, sim.data, samp) {
      epsilon <- 1e-6
      llik <- sum(dlnorm(data$N_noisy,
                         meanlog = log(sim.data[, "N"] + epsilon),
                         sdlog = samp[["sdlog.N"]], log = TRUE))
      return(llik)
    }
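
The role of the small correction can be seen by calling the observation model on toy inputs in which one predicted state is exactly zero. The sketch below repeats the function so that it is self-contained; the toy data, the toy solver output and the use of a named list for samp are assumptions made for illustration.

```r
logistic_obs_model <- function(data, sim.data, samp) {
  epsilon <- 1e-6
  sum(dlnorm(data$N_noisy,
             meanlog = log(sim.data[, "N"] + epsilon),
             sdlog = samp[["sdlog.N"]], log = TRUE))
}

# Toy observations and a toy solver output whose first prediction is zero
toy_data <- data.frame(time = c(0, 1, 2), N_noisy = c(0.12, 0.20, 0.35))
toy_sim <- cbind(time = c(0, 1, 2), N = c(0, 0.18, 0.33))
toy_samp <- list(sdlog.N = 0.5)

# With the correction the summed log-likelihood stays finite
ll_with_eps <- logistic_obs_model(toy_data, toy_sim, toy_samp)

# Without it, log(0) gives a meanlog of -Inf and the log-likelihood is -Inf
ll_without_eps <- sum(dlnorm(toy_data$N_noisy, meanlog = log(toy_sim[, "N"]),
                             sdlog = 0.5, log = TRUE))
```

A log-likelihood of -Inf would cause every proposal producing such a trajectory to be rejected, regardless of how well the remaining data points are fitted.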

Parameter, prior and sampler specification

All parameters that are used in the DE model and the observation model need to be declared for the inference procedure using the debinfer_par() function. The declaration describes the variable name, whether it is a DE or observation parameter and whether or not it is to be estimated. If the parameter is to be estimated, the user also needs to specify a prior distribution and a number of additional parameters for the MCMC procedure. deBInfer currently supports priors from all probability distributions implemented in base R, as well as their truncated variants, as implemented in the truncdist package (Novomestky & Nadarajah 2012).

We declare the DE model parameter r, assign a prior $r \sim \mathcal{N}(0, 1)$ and a random‐walk sampler with a Normal kernel (samp.type="rw") and proposal variance of 0·005 with the command

    r <- debinfer_par(name = "r", var.type = "de", fixed = FALSE,
                      value = 0.5, prior = "norm",
                      hypers = list(mean = 0, sd = 1),
                      prop.var = 0.005, samp.type = "rw")

Similarly, we declare $K \sim \ln\mathcal{N}(1, 1)$ and the observation parameter sdlog.N with a $\ln\mathcal{N}(0, 1)$ prior.

    K <- debinfer_par(name = "K", var.type = "de", fixed = FALSE,
                      value = 5, prior = "lnorm",
                      hypers = list(meanlog = 1, sdlog = 1),
                      prop.var = 0.1, samp.type = "rw")
    sdlog.N <- debinfer_par(name = "sdlog.N", var.type = "obs", fixed = FALSE,
                            value = 0.1, prior = "lnorm",
                            hypers = list(meanlog = 0, sdlog = 1),
                            prop.var = c(3, 4), samp.type = "rw-unif")

Note that we are using the asymmetric uniform proposal distribution for the variance parameter (samp.type="rw-unif"), as this ensures strictly positive proposals. Lastly, we provide an initial value $N_0$ = 0·1 for the DE:

    N <- debinfer_par(name = "N", var.type = "init", fixed = TRUE, value = 0.1)
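
To see what the prior declarations above imply, the named distribution and its hyperparameters can be mapped onto the corresponding base R density function. The helper below is a hypothetical illustration of that mapping and makes no claim about how deBInfer evaluates priors internally.

```r
# Hypothetical helper: evaluate the log-density of a prior given by its
# base R name (e.g. "norm" maps to dnorm) and a list of hyperparameters
log_prior_density <- function(x, prior, hypers) {
  dfun <- match.fun(paste0("d", prior))
  do.call(dfun, c(list(x), hypers, list(log = TRUE)))
}

# Log prior densities at the starting values declared above
lp_r <- log_prior_density(0.5, "norm", list(mean = 0, sd = 1))
lp_K <- log_prior_density(5, "lnorm", list(meanlog = 1, sdlog = 1))
lp_sd <- log_prior_density(0.1, "lnorm", list(meanlog = 0, sdlog = 1))
joint_log_prior <- lp_r + lp_K + lp_sd

# A negative carrying capacity lies outside the log-normal prior support
lp_outside <- log_prior_density(-1, "lnorm", list(meanlog = 1, sdlog = 1))
```

A log prior density of -Inf corresponds to zero prior probability, which is the condition under which a proposal can be rejected without solving the DE model.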

MCMC inference

The MCMC procedure is called using the function de_mcmc(), which takes the declared parameters, the DE and observational models, the data and further optional arguments to the MCMC procedure and/or the solver as inputs and returns an object of class debinfer_result containing the resulting MCMC samples.

All declared parameters are collated using setup_debinfer()

    mcmc.pars <- setup_debinfer(r, K, sdlog.N, N)

and passed to de_mcmc() which is set to use deSolve::ode() as a back end in this case, as specified by the argument solver="ode"

    # do inference with deBInfer
    # MCMC iterations
    iter <- 5000
    # inference call
    mcmc_samples <- de_mcmc(N = iter, data = logistic,
                            de.model = logistic_model,
                            obs.model = logistic_obs_model,
                            all.params = mcmc.pars,
                            Tmax = max(logistic$time),
                            data.times = logistic$time,
                            cnt = 500, plot = FALSE, solver = "ode")

Inference outputs

The inference function returns an object of class debinfer_result, which contains the posterior samples in a format compatible with the coda package (Plummer et al. 2006), as well as the DE and observation models and all parameters used for inference. This allows the use of the diagnostic functions and plotting routines provided in coda (see Fig. 1). We also provide additional functions and methods such as pairs.debinfer_result() to create pairwise plots of the marginal posterior distributions, which show correlations between individual parameters (see Fig. 2), post_prior_densplot(), which allows a visual comparison between prior and marginal posterior densities for each parameter, and post_sim(), which simulates posterior model trajectories and associated credible intervals, as well as plotting methods for the latter (see Fig. 3).

Figure 1. Markov chain Monte Carlo traces and posterior density plots for the logistic model. Figures like this one can be created using plot.debinfer_result.

Figure 2. Pairwise plot of the marginal posterior distributions. This figure was created using pairs.debinfer_result.

Figure 3. Posterior model trajectory (median with 95% highest posterior density interval), created with plot.post_sim_list, and the data points used for fitting.
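
The idea behind posterior trajectories can be sketched without the package: solve the model once per posterior draw and summarize the resulting curves pointwise. The example below is an assumption-laden illustration in base R. It uses synthetic draws in place of real MCMC output, the closed-form logistic solution in place of the numerical solver, and equal-tailed quantile intervals rather than highest posterior density intervals.

```r
set.seed(3)

# Synthetic stand-ins for post-burn-in posterior draws of r and K
post_r <- rnorm(1000, mean = 0.4, sd = 0.01)
post_K <- rlnorm(1000, meanlog = log(10), sdlog = 0.03)

times <- 0:30
logistic_solution <- function(r, K, times, N0 = 0.1) {
  K / (1 + (K / N0 - 1) * exp(-r * times))
}

# One trajectory per draw: rows are time points, columns are draws
trajectories <- mapply(logistic_solution, post_r, post_K,
                       MoreArgs = list(times = times))

# Pointwise median and equal-tailed 95% interval across draws
band <- apply(trajectories, 1, quantile, probs = c(0.025, 0.5, 0.975))
```

The resulting band has one column per time point, so it can be drawn directly over the observations, for example with matplot(times, t(band), type = "l").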

Example application – DDE model of fungal population growth

To illustrate applications of deBInfer beyond the simplistic example above, we outline inference procedures for a more complex model and corresponding observational data. Full model details and annotated code can be found in Appendix S2. Our example demonstrates parameter inference for a DDE model of population growth in the environmentally sensitive fungal pathogen Batrachochytrium dendrobatidis (Bd), which causes the amphibian disease chytridiomycosis (Rosenblum et al. 2010; Voyles et al. 2012). This model has been used to further our understanding of pathogen responses to changing environmental conditions. Further details about the model development, and the experimental procedures yielding the data used for parameter inference, can be found in Voyles et al. (2012).

The model follows the dynamics of the concentration of an initial cohort of zoospores, C, the concentration of zoospore‐producing sporangia, S, and the concentration of zoospores in the next generation Z. The initial cohort of zoospores, C, starts at a known concentration, and zoospores in this initial cohort settle and become sporangia at rate urn:x-wiley:2041210X:media:mee312679:mee312679-math-0057, or die at rate urn:x-wiley:2041210X:media:mee312679:mee312679-math-0058. urn:x-wiley:2041210X:media:mee312679:mee312679-math-0059 is the fraction of sporangia that survive to the zoospore‐producing stage. We assume that it takes a minimum of urn:x-wiley:2041210X:media:mee312679:mee312679-math-0060 days before the sporangia produce zoospores, after which they produce zoospores at rate η. Zoospore‐producing sporangia die at rate urn:x-wiley:2041210X:media:mee312679:mee312679-math-0061. The concentration of zoospores, Z, is the only state variable measured in the experiments, and it is assumed that these zoospores settle (urn:x-wiley:2041210X:media:mee312679:mee312679-math-0062) or die (urn:x-wiley:2041210X:media:mee312679:mee312679-math-0063) at the same rates as the initial cohort of zoospores. The equations that describe the population dynamics are as follows:
urn:x-wiley:2041210X:media:mee312679:mee312679-math-0064(eqn 10)
urn:x-wiley:2041210X:media:mee312679:mee312679-math-0065(eqn 11)
urn:x-wiley:2041210X:media:mee312679:mee312679-math-0066(eqn 12)
Because the observations are counts of zoospores (i.e. discrete numbers), we assume that observations of the system at a set of discrete times urn:x-wiley:2041210X:media:mee312679:mee312679-math-0067 are independent Poisson random variables with a mean given by the solution of the DDE, at times urn:x-wiley:2041210X:media:mee312679:mee312679-math-0068. The log‐likelihood of the data given the parameters, underlying model and initial conditions is then a sum over the n observations at each time point in urn:x-wiley:2041210X:media:mee312679:mee312679-math-0069
urn:x-wiley:2041210X:media:mee312679:mee312679-math-0070(eqn 13)
In this case, we conduct inference using deSolve::dede() as the back end to de_mcmc. The marginal posteriors of the estimated parameters are presented in Fig. 4, and posterior trajectories for the model are presented in Fig. 5.

Figure 4. Comparison of marginal posterior densities (black) and the corresponding priors (red) of the estimated parameters of the chytrid model. This plot was created using post_prior_densplot.

Figure 5. Posterior trajectories for each state variable of the chytrid model based on 1000 model simulations from the posterior of the parameters and the data points urn:x-wiley:2041210X:media:mee312679:mee312679-math-0071 used for fitting.
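
A Poisson observation model of this kind can be written in the same three-argument template as the logistic example. The following is a minimal sketch, not the code from Appendix S2: the column names count and Z, the small positive offset on the Poisson mean, and the toy inputs are all assumptions made for illustration.

```r
# Hypothetical Poisson observation model in the (data, sim.data, samp) template
poisson_obs_model <- function(data, sim.data, samp) {
  ec <- 0.01  # assumed offset that keeps the Poisson mean strictly positive
  llik <- 0
  for (t_i in unique(data$time)) {
    # Predicted zoospore concentration at this observation time
    lambda_i <- sim.data[sim.data[, "time"] == t_i, "Z"] + ec
    # Sum over the replicate counts observed at this time point
    llik <- llik + sum(dpois(data$count[data$time == t_i],
                             lambda = lambda_i, log = TRUE))
  }
  llik
}

# Toy data with two replicate counts per time point, and a toy solver output
toy_data <- data.frame(time = c(1, 1, 2, 2), count = c(3, 5, 9, 12))
toy_sim <- cbind(time = c(0, 1, 2), Z = c(0, 4.2, 10.1))
ll_toy <- poisson_obs_model(toy_data, toy_sim, samp = list())
```

The samp argument is unused here because the Poisson model has no separate observation parameter; it is kept only so that the function matches the template.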

Known limitations

The MCMC sampler is implemented in R, which makes it considerably slower than samplers written in compiled languages, for example those underlying packages such as Stan (Carpenter et al. 2016) or Filzbach (Purves & Lyutsarev 2016). For inference conducted purely in R, the computational bottleneck is solving the DE model numerically. However, even for relatively simple models, a 5‐ to 10‐fold speedup of the inference procedure can be achieved using compiled DE models (see Appendix S3). Furthermore, the deBInfer MCMC algorithm is not adaptive and requires manual tuning. Lastly, sampling using the Metropolis–Hastings MCMC algorithm itself can be inefficient in the presence of strong parameter correlations. Alternative approaches such as Hamiltonian MC (Carpenter et al. 2016) or particle‐filtering methods (e.g. King et al. 2016) may offer more efficient means for parameter estimation in ODEs in these cases. Nonetheless, the package is able to fit real‐world problems in a matter of minutes to hours on current desktop hardware, which is acceptable for many applications, while providing flexible inference for both ODE and DDE models.

Conclusion

Understanding the mechanisms underlying biological systems, and ultimately, predicting their behaviours in a changing environment, requires overcoming the gap between mathematical models and experimental or observational data. We believe that Bayesian inference provides a powerful tool for fitting dynamical models and selecting between competing models. The deBInfer R package provides a suite of tools to this end in a programming language that is widespread in many biological disciplines. We hope that our package will lower the hurdle to the uptake of this inference approach for empirical biologists. We encourage users to report bugs and provide other feedback on the project issue page: https://github.com/pboesu/debinfer/issues

Authors' contributions

L.R.J. conceived the methodology and wrote the initial R implementation; P.H.B.S. re‐implemented the methodology as an R package; P.H.B.S. and S.J.R. wrote the package documentation; and P.H.B.S. led the writing of the manuscript. All authors tested the software, contributed critically to the drafts and gave final approval for publication.

Acknowledgements

The authors thank Richard FitzJohn and two anonymous reviewers for their constructive comments on earlier versions of the code and manuscript. All authors were supported by the US National Science Foundation (Grant PLR‐1341649). The authors also thank Jamie Voyles for sharing the chytrid growth data. The authors have no conflict of interest to declare.

Data accessibility

All code and data used in this article are included in the deBInfer package and its vignettes, which are freely available from CRAN: https://cran.r-project.org/package=deBInfer. The development version of the package is available at https://github.com/pboesu/debinfer.
