microPop: Modelling microbial populations and communities in R
Abstract
- Microbial communities perform highly dynamic and complex ecosystem functions that impact plants, animals and humans. Here we present an R‐package, microPop, which is a dynamic model based on a functional representation of different microbiota.
- microPop simulates the dynamics and interactions of microbial populations by solving a system of ordinary differential equations that are constructed automatically based on a description of the system.
- Data frames for a number of microbial functional groups (MFG) and default functions for rates of microbial growth, resource uptake and metabolite production are provided but can be modified or replaced by the user.
- microPop can simulate growth in a single compartment (e.g. bio‐reactor) or “compartments” in series (e.g. human colon) or in a simple 1D application (e.g. phytoplankton in a water column). Furthermore, an MFG may contain multiple strains in order to study adaptation and diversity or parameter uncertainty. Simple interactions between viruses (bacteriophages) and bacteria can also be included in microPop.
- microPop is hosted on CRAN and can be installed directly from within R. This paper describes version 1.3 of microPop. The code is also hosted on GitHub for future development (https://github.com/HelenKettle/microPop).
1 INTRODUCTION
[Equation 1]
[Equation 2]
In Equations 1 and 2, the inflow and outflow rates of the system (units of inverse time) are defined separately for microbes (i = X) and resources (i = R), and Xin(t) and Rin(t) are the incoming quantities of microbes and resources respectively. G(t) is the specific growth rate of microbes on the resource (units of inverse time) and can be expressed in a variety of ways (see Appendix B in Supporting Information). The second term on the right‐hand side of Equation 2 is the uptake rate of the resource due to microbial growth, where Y is the yield, i.e. the quantity of microbial growth per unit of resource taken up.
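The definitions above are consistent with a standard single-group, single-resource balance. The following is a plausible reading of Equations 1 and 2, not a verbatim copy: the symbols $V_i^{+}$ and $V_i^{-}$ for the inflow and outflow rates, and the order of the terms, are assumptions introduced here.

```latex
\begin{align}
\frac{dX}{dt} &= V_X^{+}(t)\,X_{in}(t) \;-\; V_X^{-}(t)\,X(t) \;+\; G(t)\,X(t), \\
\frac{dR}{dt} &= V_R^{+}(t)\,R_{in}(t) \;-\; V_R^{-}(t)\,R(t) \;-\; \frac{G(t)\,X(t)}{Y}.
\end{align}
```

In this form, growth adds $G(t)X(t)$ to the microbial pool and removes $G(t)X(t)/Y$ from the resource pool, so a lower yield $Y$ means more resource is consumed per unit of growth.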
When there are multiple resources and several microbial groups with multiple strains then Equations 1 and 2 expand into a large system with multiple metabolic pathways. This is where microPop is a useful tool. Rather than coding these equations, the user simply gives a description of the system (using 2 data frames, “resourceSysInfo” and “microbeSysInfo”) and a data frame for each MFG and these equations are constructed and solved by microPopModel (ODE solvers are provided by the deSolve package (Soetaert, Petzoldt, & Setzer, 2010)).
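To make the single-group case concrete, the sketch below integrates one microbial group on one resource in base R with a fixed-step Euler scheme. It is an illustration only and not how microPop works internally (microPop builds the equations automatically and uses deSolve). The function name `simulate_growth`, the Monod form of G(t), the constant inflow and outflow rates, and all parameter values are assumptions chosen for the example.

```r
# Minimal sketch: one microbial group X growing on one resource R.
# All names and numbers are hypothetical; microPop itself is not used here.
simulate_growth <- function(times, X0, R0, Xin, Rin, inflow, outflow, Gmax, K, Y) {
  n <- length(times)
  X <- numeric(n)
  R <- numeric(n)
  X[1] <- X0
  R[1] <- R0
  for (k in seq_len(n - 1)) {
    dt <- times[k + 1] - times[k]
    # Assumed Monod form for the specific growth rate G(t)
    G <- Gmax * R[k] / (K + R[k])
    dX <- inflow * Xin - outflow * X[k] + G * X[k]
    dR <- inflow * Rin - outflow * R[k] - G * X[k] / Y
    # Explicit Euler step, clamped at zero to avoid negative concentrations
    X[k + 1] <- max(X[k] + dt * dX, 0)
    R[k + 1] <- max(R[k] + dt * dR, 0)
  }
  data.frame(time = times, X = X, R = R)
}

sol <- simulate_growth(
  times = seq(0, 200, by = 0.01),
  X0 = 0.1, R0 = 10, Xin = 0, Rin = 10,
  inflow = 0.1, outflow = 0.1,
  Gmax = 1, K = 0.5, Y = 0.5
)
final <- sol[nrow(sol), ]
print(final)
```

With equal, constant inflow and outflow rates D and Monod growth, the steady state can be checked by hand: the resource settles at K·D/(Gmax − D) and the microbes at Y·(Rin − R). Once several groups, strains and pathways are involved, writing these balances by hand becomes error-prone, which is the case the package is designed for.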
Data frames for a number of MFGs found in the human large intestine (e.g. Bacteroides, Acetogens, Methanogens, Butyrate Producers, Lactate Producers and so on) as described by Kettle, Louis, Holtrop, Duncan, and Flint (2015) and the rumen (by Munoz‐Tamayo, Giger‐Reverdin, & Sauvant, 2016) are included in the package (Table 1). If the user simply wishes to use these MFGs, then microPop can be used “off the shelf”; however, any number of other MFGs may also be added by defining a data frame in the correct format.
| MFG (Kettle et al., 2015) | Description | Examples |
|---|---|---|
| Bacteroides | Acetate‐propionate‐succinate group | Bacteroides spp. |
| NoButyStarchDeg | Non‐butyrate‐forming starch degraders | Ruminococcaceae related to Ruminococcus bromii. Might also include certain Lachnospiraceae |
| NoButyFibreDeg | Non‐butyrate‐forming fibre degraders | Ruminococcaceae related to Ruminococcus albus, Ruminococcus flavefaciens. Might also include certain Lachnospiraceae |
| LactateProducers | Lactate producers | Actinobacteria, especially Bifidobacterium spp., Collinsella aerofaciens |
| ButyrateProducers1 | Butyrate Producers | Lachnospiraceae related to Eubacterium rectale, Roseburia spp. |
| ButyrateProducers2 | Butyrate Producers | Certain Ruminococcaceae, in particular Faecalibacterium prausnitzii |
| PropionateProducers | Propionate producers | Veillonellaceae e.g. Veillonella spp., Megasphaera elsdenii |
| ButyrateProducers3 | Butyrate Producers | Lachnospiraceae related to Eubacterium hallii, Anaerostipes spp. |
| Acetogens | Acetate Producers | Certain Lachnospiraceae, e.g. Blautia hydrogenotrophica |
| Methanogens | Methanogenic archaea | Methanobrevibacter smithii |

| MFG (Munoz‐Tamayo et al., 2016) | Description | Examples |
|---|---|---|
| Xsu | Sugar utilizers | |
| Xaa | Amino acid utilizers | |
| Xh2 | Hydrogen utilizers | Methanobrevibacter smithii |
Since many of the required parameter values for the MFGs are not well known, it should be noted that the parameter values stated in the included MFG data frames will almost certainly change with increasing knowledge and in some cases can be interpreted as simply a “best guess.” One way of coping with this parameter uncertainty is addressed in our previous work (Kettle et al., 2015) (and included in microPop) where we assigned multiple strains to each MFG with stochastically generated parameter values. The strains will compete with each other; some will flourish, some will die out, and by the time a steady state is reached a viable microbial community for the given environment will have been created. By changing the seed for the random number generator in microPopModel, multiple viable communities can be created and ensemble statistics can be used to define the solution.
Since microbial growth, resource uptake and metabolite production may be modelled in a number of ways, the choices behind microPop's default growth and uptake rate functions are explained fully in the Appendix. All of these functions are contained in a list called “rateFuncs” (Table 2) and may be redefined by the user (see Appendix A in Supporting Information), allowing microPop to be applied to a large number of different microbial ecosystems. Section 2 gives some examples of what microPop can do and Section 3 gives a brief description of how to use microPop; these sections can be read in either order.
| Function name | Description |
|---|---|
| entryRateFunc | Rate of entry of each state variable to system at time t |
| removalRateFunc | Rate of exit of each state variable from system at time t |
| pHFunc | pH value at time t |
| pHLimFunc | pH limit on growth (varies between 0 and 1 for a given pH value) |
| extraGrowthLimFunc | Another limit on growth (default value is 1, i.e. no limit). This is included to allow the user to add any kind of growth limitation, as its output is used to scale the maxGrowthRate value |
| growthLimFunc | This scales the maximum growth value (value between 0 and 1) |
| combineGrowthLimFunc | Combining growth on multiple resources |
| uptakeFunc | Uptake of resource due to microbial growth |
| productionFunc | Production of metabolites resulting from microbial growth |
| combinePathsFunc | Combining the results of growth on multiple metabolic pathways |
| createDF | Creates a data frame from a .csv file |
| derivsDefault | Describes the ODEs; called by ode |
| getGroupName | Returns the name of the group from the strain name |
| makeInflowFromSoln | Returns the exit rate of each state variable (matrix[time,variable]) |
| microPopModel | Simulates growth of microbial populations (main function) |
| pHcentreOfMass | Finds the mean pH weighted by the pH limitation |
| plotTraitChange | Plots the average group trait over time (when there are multiple strains per group) |
| runMicroPopExample | Used to run the scripts for the examples described in Section 2 |
2 EXAMPLE APPLICATIONS
Here, we give a flavour of how microPop can be used to simulate a wide range of microbial systems. For more information on these examples please refer to the vignette included with the package (vignette('microPop') in R). The scripts for all of these examples are included in the microPop package (the location of these files can be found with system.file('DemoFiles/ExampleFileName.R', package='microPop') and is also printed to screen when the script is run) and in the supporting information with this paper; they are intended to serve as a template for users when defining their own problems. The name of the appropriate script is given in brackets in each example heading, and the scripts can be run in R using, e.g., runMicroPopExample('human1') for the human1.R script (Section 2.1.1). Most of the plots shown in this paper are automatically generated by microPop and can be tweaked using the “plotOptions” input list in microPopModel.
2.1 Modelling human gut microbiota
The microbial ecosystem in the human colon has been linked to numerous issues in human health. For example, two important functions are harvesting extra energy from our food, thus warranting the name the “forgotten organ” (O'Hara & Shanahan, 2006), and aiding the development of our immune system (Chow, Lee, Shen, Khosravi, & Mazmanian, 2010). The following four examples are based on the model described by Kettle et al. (2015) which uses 10 different microbial groups to represent the microbial community in the human colon (Table 1). Here we use just three of these—Bacteroides, NoButyStarchDeg (starch degraders that do not produce butyrate) and Acetogens—to demonstrate some features of microPop. It should be noted that although we believe the microbial system described here is representative of that in the human colon, in this application the model is set up to simulate an experiment with a fecal sample in a fermentor system, therefore absorption of short chain fatty acids (SCFA) and other interactions with the host are not included. The information describing the inflows and outflows of each state variable for these scenarios is contained in the data frames “resourceSysInfoHuman” and “microbeSysInfoHuman” which are included with the package and are based on the system described by Kettle et al. (2015) and Walker et al. (2005). To look at these simply type “resourceSysInfoHuman” or “microbeSysInfoHuman” at the R prompt. Since these contain information on all 10 groups used in the full simulation by Kettle et al. (2015), the user can also use these when simulating the behaviour of any/all of the 10 groups.
2.1.1 Microbial growth in a constant environment (human1.R)
This is a simple example to show how microPopModel can be run using most of the default settings and the data frames included with the package. In this scenario, there is no limit on growth due to pH and the Bacteroides group dominates the system (Figure 1).

2.1.2 How does temporal pH change affect microbial growth? (human2.R)
In this scenario, pH changes from 5.5 to 6.5 halfway through the simulation. This is implemented by altering rateFuncs$pHFunc and setting the input argument “pHLimit=TRUE”. Due to their preferred pH ranges (determined by “pHcorners” in the data frames for each group), NoButyStarchDeg now dominate the first half of the simulation; however, when the pH rises to 6.5, Bacteroides regain dominance (Figure 2).
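The two ingredients of this scenario, a pH that steps up halfway through and a group-specific pH limit between 0 and 1, can be sketched in base R. This is an illustration under stated assumptions: the helper names `ph_at_time` and `ph_limit` are hypothetical and do not reproduce the signatures of microPop's pHFunc or pHLimFunc, the trapezoidal shape built from four corner values is an assumed reading of “pHcorners”, and the corner values are made up.

```r
# Hypothetical helper: pH steps from ph_before to ph_after at t_switch.
ph_at_time <- function(time, t_switch, ph_before = 5.5, ph_after = 6.5) {
  ifelse(time < t_switch, ph_before, ph_after)
}

# Hypothetical trapezoidal pH limit: 0 outside corners[1]..corners[4],
# 1 between corners[2] and corners[3], linear ramps in between.
ph_limit <- function(pH, corners) {
  stopifnot(length(corners) == 4, !is.unsorted(corners, strictly = TRUE))
  rising  <- (pH - corners[1]) / (corners[2] - corners[1])
  falling <- (corners[4] - pH) / (corners[4] - corners[3])
  pmax(0, pmin(1, rising, falling))
}

times <- c(0, 1, 2, 3)
pH <- ph_at_time(times, t_switch = 2)
example_corners <- c(5, 6, 7, 8)  # made-up values, not taken from any MFG data frame
limits <- ph_limit(pH, example_corners)
print(data.frame(time = times, pH = pH, limit = limits))
```

A group whose corners favour low pH would have a limit near 1 in the first half and a reduced limit after the switch, which is the mechanism behind the change in dominance described above.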

2.1.3 How does spatial pH change affect microbial growth? (human3.R)
Here, we approximate the pH change in sections of the human colon by defining the system as two compartments where the first one (at pH 5.5) flows into the second (at pH 6.0). To simulate two compartments, we add a loop to call microPopModel twice. The first call simulates growth in the first compartment over the whole of the simulation time. The results from this are then used to provide the entry rates to the second compartment (using the function makeInflowFromSoln) in the second microPopModel call. The results (Figure 3) show that NoButyStarchDeg dominate in the first compartment (top row), while Bacteroides initially dominate the second compartment, but this changes due to the large inflow of NoButyStarchDeg from the previous compartment.
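The chaining idea (solve the first compartment over the full time span, then feed its outflow in as the entry rate of the second) can be sketched without the package. The code below is a simplified stand-in with one microbial group and one resource, solved by explicit Euler steps. The function `run_compartment`, the scalar `growth_limit` used to mimic a pH effect, and all parameter values are assumptions for illustration, not microPop's implementation.

```r
# Hypothetical single-compartment solver: entry_X and entry_R are vectors of
# entry rates (quantity per unit time), one value per time point.
run_compartment <- function(times, X0, R0, entry_X, entry_R,
                            outflow, Gmax, K, Y, growth_limit) {
  n <- length(times)
  X <- numeric(n)
  R <- numeric(n)
  X[1] <- X0
  R[1] <- R0
  for (k in seq_len(n - 1)) {
    dt <- times[k + 1] - times[k]
    G <- growth_limit * Gmax * R[k] / (K + R[k])  # assumed Monod growth
    X[k + 1] <- X[k] + dt * (entry_X[k] - outflow * X[k] + G * X[k])
    R[k + 1] <- R[k] + dt * (entry_R[k] - outflow * R[k] - G * X[k] / Y)
  }
  data.frame(time = times, X = X, R = R)
}

times <- seq(0, 200, by = 0.01)
n <- length(times)
outflow <- 0.1
Rin <- 10

# Compartment 1: fed by a constant resource supply, growth reduced by pH
first <- run_compartment(times, X0 = 0.1, R0 = 10,
                         entry_X = rep(0, n), entry_R = rep(outflow * Rin, n),
                         outflow = outflow, Gmax = 1, K = 0.5, Y = 0.5,
                         growth_limit = 0.5)

# Compartment 2: its entry rates are the exit rates of compartment 1
second <- run_compartment(times, X0 = 0.1, R0 = 10,
                          entry_X = outflow * first$X, entry_R = outflow * first$R,
                          outflow = outflow, Gmax = 1, K = 0.5, Y = 0.5,
                          growth_limit = 1)
print(second[n, ])
```

Because the second compartment receives microbes as well as resource, its community is shaped by what washes in from upstream and not only by its own growth conditions.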

2.1.4 How does microbial diversity affect response to pH? (human4.R)
Here we use the “human2” example, where pH changes from 5.5 to 6.5 halfway through the simulation, but include microbial diversity by assigning five strains to each microbial group (via input argument, “numStrains”). We assume that the strains within a microbial group have the same metabolic pathways i.e. those specified in the group data frame, but diversity is incorporated by randomly varying some of their growth parameters (based on Kettle et al., 2015). The extent of the variation, the parameters which are to be randomised and whether trade‐offs are required are all controlled via the “strainOptions” list. Moreover, the user may also specify the parameter values for individual strains using “paramsSpecified” and “paramDataName” also in this list (note, not all parameter values need to be specified—those that are specified will simply overwrite the randomly generated values). Figure 4a shows the results for each strain.
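The idea of strains that share pathways but differ in randomly perturbed growth parameters can be illustrated in a few lines of base R. This is a sketch under assumptions: the helper `make_strain_values`, the uniform distribution and the ±10% range are hypothetical choices for the example and are not taken from the “strainOptions” defaults.

```r
# Hypothetical helper: draw one parameter value per strain, uniformly within
# +/- percent_range % of the group value, reproducibly for a given seed.
make_strain_values <- function(group_value, num_strains, percent_range, seed) {
  set.seed(seed)
  half_width <- group_value * percent_range / 100
  values <- runif(num_strains,
                  min = group_value - half_width,
                  max = group_value + half_width)
  names(values) <- paste0("strain", seq_len(num_strains))
  values
}

# Five strains around a made-up group-level maximum growth rate of 1.2
community_a <- make_strain_values(1.2, num_strains = 5, percent_range = 10, seed = 1)
community_b <- make_strain_values(1.2, num_strains = 5, percent_range = 10, seed = 2)
print(round(community_a, 3))
print(round(community_b, 3))
```

Changing the seed gives a different set of strains, so repeating a simulation over several seeds yields an ensemble of communities rather than a single trajectory.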

[Equation 3]
2.2 Methane production from rumen microbiota (rumen.R)
Methane production through feed fermentation by ruminants contributes significantly to greenhouse gas production by agriculture (Cottle, Nolan, & Weidemann, 2011). Here we use microPop to model in vitro rumen fermentation, based on a simplified version of the model by Munoz‐Tamayo et al. (2016). The construction of this model differs from the human colon model in Section 2.1 in several ways. Firstly, and most importantly, there are no substitutable resources; all resources are essential (see Appendix B.1 for an explanation of the different types of resource) and microbial growth is included explicitly in the group stoichiometries (the groups involved are sugar‐utilisers (Xsu), amino‐acid utilisers (Xaa) and hydrogen utilisers (Xh2); included data frames “Xh2,” “Xsu” and “Xaa”). Secondly, hydrolysis is treated as a separate process such that polymer substrates must be hydrolysed to soluble sugars and amino acids before they are available for microbial uptake. Thirdly, dead microbial cells are recycled into polymers.
For demonstration purposes, we have simplified the original model by Munoz‐Tamayo et al. (2016) as follows: we consider only constituents dissolved in the rumen fluid (thereby removing gas transfer from the fluid to the rumen head space), we have removed carbon chemistry (we only consider dissolved inorganic carbon) and we have removed the calculation of pH from acid‐base reactions. Also, we use units of mass rather than moles. Figure 5 shows a schematic diagram of the system and notation of state variables is given in the figure caption. (microPop code for the original, unsimplified model is available on request for academic purposes.)

Since polymers are not used directly by any of the microbial groups (and are therefore not mentioned in the MFG data frames) they will not be automatically added as state variables by microPopModel. Thus to include hydrolysis we add Znsc, Zndf and Zpro to the microPop data frame for Xsu. We then explicitly state the parameters needed for hydrolysis and recycling of dead cells into polymers as these are not included in the input files. Furthermore, removalRateFunc is redefined to include the reduction rate for polymers and the entryRateFunc includes the equivalent increase for soluble sugars (Ssu) and soluble amino acids (Saa). Similarly the death of microbial cells is included in removalRateFunc and the increase in polymers from the dead cells is included in entryRateFunc.
Using the same settings as Munoz‐Tamayo et al. (2016), we investigate how increasing the initial concentrations of the feed polymers, Znsc, Zndf and Zpro, affects the concentration of methane in the rumen (Sch4). Thus we set the initial polymer concentrations at 1 g/l and then increase each one in turn to 20 g/l (Figure 6). Increasing Zndf and Zpro leads to increasing methane concentrations, as expected; however, the second row in Figure 6 shows that, somewhat counter‐intuitively, the amount of methane produced decreases once the initial concentration of Znsc rises above a threshold between 15 and 20 g/l. SIC (soluble inorganic carbon) and soluble hydrogen (not shown) both increase with Znsc, so the cause appears to be the decrease in ammonia (Snh3) (third column in Figure 6), which rapidly falls to zero for high initial values of Znsc. This is because Znsc is hydrolysed at a much faster rate (0.2 h−1) than Zndf (0.05 h−1), so increased Znsc leads to increased Ssu, rapid growth of Xsu and hence rapid uptake of Snh3. The depletion of Snh3 inhibits the growth of Xh2 and thus mitigates methane production in this simple model example.
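To give a feel for the difference between the two hydrolysis rates quoted above, the short calculation below compares them, assuming simple first-order kinetics (an assumption made for this illustration; the rate constants are the ones stated in the text, and the 5 h horizon is arbitrary).

```r
# Hydrolysis rate constants from the text (per hour)
hydrolysis_rate <- c(Znsc = 0.2, Zndf = 0.05)

# Under assumed first-order decay Z(t) = Z0 * exp(-k * t):
half_life <- log(2) / hydrolysis_rate                  # hours to hydrolyse half
fraction_hydrolysed <- function(k, t) 1 - exp(-k * t)  # fraction gone after t hours
after_5h <- fraction_hydrolysed(hydrolysis_rate, 5)

print(round(half_life, 2))
print(round(after_5h, 3))
```

Under this assumption Znsc is released as soluble sugar four times faster than Zndf, which is why a large initial Znsc pool produces a rapid burst of Ssu and a correspondingly rapid draw-down of Snh3 by Xsu.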

2.3 At what depth do phytoplankton grow best? (phyto.R)
[Equation 4]
[Equation 5]
To define this system in microPop, we consider nutrient to be the only resource since light is not depleted through microbial use and therefore does not need to be included as a state variable. Nutrient upwelling is incorporated via entryRateFunc and light limitation via extraGrowthLimFunc (the output from this function is used to scale the maximum growth rate in a similar way to pHLimFunc). There is no wash out rate for resources but we set a small wash out rate for the phytoplankton of 0.005 day−1 (see “systemInfoMicrobesPhyto”) to represent death rate.
We divide a depth of 20 m into 1 m layers and run microPop for each layer for a simulation time of 3 months. The simulation begins with phytoplankton spread evenly through the depth of the water column (e.g. this may occur after vertical mixing caused by high winds). Thereafter, there is no mixing (calm conditions) and the phytoplankton are stationary in the water but grow at different rates according to the light and nutrient levels at that particular depth. In Case 1 (when running runMicroPopExample(‘phyto’) the user will be prompted to enter a case number) we simulate the growth of just one phytoplankton group. Figure 7a shows how the magnitude and depth of the bloom changes.

In Case 2, we add in two more groups, all with the same starting concentration. The three groups have different requirements for nutrient and light as determined by their half saturation values for nutrient and light (KN and KL respectively). Figure 7b shows how over time the groups occupy different levels in the water column.
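The way half saturation values for nutrient and light can separate groups by depth is illustrated below in base R. Everything in this sketch is an assumption made for illustration: the exponential light attenuation, the linear nutrient profile, the saturating (Monod-type) form of each limitation, their multiplicative combination, and all numerical values. It evaluates a static growth-limit profile only and does not simulate growth or nutrient depletion.

```r
# Mid-points of twenty 1 m layers
depth <- seq(0.5, 19.5, by = 1)

# Hypothetical environment: light decays with depth, nutrient increases with depth
light <- exp(-0.2 * depth)   # surface light scaled to 1
nutrient <- depth / 20       # richer towards the bottom (upwelling source)

# Assumed combined limitation: product of two saturating terms, each in [0, 1]
growth_limit <- function(nutrient, light, KN, KL) {
  (nutrient / (KN + nutrient)) * (light / (KL + light))
}

# Two made-up groups: A needs little nutrient but more light, B the opposite
limit_a <- growth_limit(nutrient, light, KN = 0.1, KL = 0.1)
limit_b <- growth_limit(nutrient, light, KN = 0.5, KL = 0.01)

best_depth_a <- depth[which.max(limit_a)]
best_depth_b <- depth[which.max(limit_b)]
print(c(A = best_depth_a, B = best_depth_b))
```

In this toy profile the group with the lower light requirement has its most favourable layer deeper in the column, where nutrient is more plentiful, which mirrors the vertical separation of groups described above.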
2.4 Bacteriophages and resistance (phages.R)
Although not the main intended use of microPop, bacteriophages (viruses which attack bacteria) can be included in microPop in a simplistic way. In this example, we consider two (theoretical) groups of bacteria (called Bacteria1 and Bacteria2) and two bacteriophages called Virus1 and Virus2. Both bacteria have the same substrate (nutrient) and the same parameters, the only difference being that Bacteria2 has a higher maximum growth rate than Bacteria1. Virus1 attacks Bacteria1, and Virus2 attacks Bacteria2. The two viruses have the same parameter values and differ only in their choice of host cell (bacterial group). We consider a simple system with a constant dilution rate of 0.1 day−1. All variables have a starting value of 1 and the only inflow is nutrient.
[Equation 6]
[Equation 7]
[...] per day (fB). Thus the rate of change of [...] due to mutation is
[Equation 8]
We run microPop for four different system scenarios (when running runMicroPopExample('phages'), the user will be prompted to choose from cases 1 to 4); the results are shown in Figure 8. To begin with, we look at the system without viruses and see the two bacteria competing for nutrient; since Bacteria2 has the higher growth rate, it dominates the system (case 1; Figure 8a). We now add in Virus2, which attacks Bacteria2; this allows Bacteria1 to dominate the system and causes Bacteria2, and hence Virus2, to die out (case 2; Figure 8b). If we now add in Virus1, so that we have both bacterial groups and both viral groups, more complex dynamics emerge (case 3; Figure 8c). In the fourth case, we add in random mutations from Bacteria1 to a resistant form of Bacteria1, which is resistant to Virus1 and therefore survives at the expense of the other bacterial groups (case 4; Figure 8d).

3 RUNNING microPop
- ‘microbeNames’—a vector of the names of the microbial groups in your system, e.g. c(‘Bacteroides’, ‘Methanogens’). Note that a data frame with the same name must be available for each group specified.
- ‘times’—a vector defining the time sequence at which output is required, e.g. seq(0,10,0.1).
- ‘resourceSysInfo’—this is a data frame or the name of a csv file describing the inflow, outflow, start values and molar masses of the substrates and products associated with the microbial groups specified in microbeNames. See help(resourceSysInfo) for details.
- ‘microbeSysInfo’—this is a data frame or the name of a csv file describing the inflow, outflow and start values of the microbial groups specified in microbeNames. See help(microbeSysInfo) for details.
Figure 9 shows this in detail, using the example given in help(microPopModel). Details of all the input arguments can be found via the function help and in the vignette included with the package.
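A minimal call using only the four inputs listed above might look like the following sketch. The group names and the built-in data frames “resourceSysInfoHuman” and “microbeSysInfoHuman” are those mentioned in Section 2.1, and the time vector is the one given as an example above; the structure of the returned object is not assumed here, so the sketch only prints a summary of it. The call is guarded so that the snippet does nothing if microPop is not installed.

```r
# Inputs that do not depend on the package being installed
model_inputs <- list(
  microbeNames = c("Bacteroides", "NoButyStarchDeg", "Acetogens"),
  times = seq(0, 10, 0.1)
)

if (requireNamespace("microPop", quietly = TRUE)) {
  # Attaching the package makes the MFG and system data frames available by name
  library(microPop)
  out <- microPopModel(
    microbeNames = model_inputs$microbeNames,
    times = model_inputs$times,
    resourceSysInfo = resourceSysInfoHuman,
    microbeSysInfo = microbeSysInfoHuman
  )
  str(out, max.level = 1)  # inspect the top level of whatever is returned
} else {
  message("microPop is not installed; install.packages('microPop') fetches it from CRAN.")
}
```

All other arguments are left at their defaults; help(microPopModel) lists them and is the authoritative reference for their names and meanings.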

ACKNOWLEDGEMENTS
We thank the Scottish Government's Rural and Environment Science and Analytical Services Division (RESAS) for funding this research. Also many thanks to Rafael Munoz‐Tamayo for his rumen model MATLAB code and to the three anonymous reviewers.
AUTHORS' CONTRIBUTIONS
H.K. conceived the idea, wrote the package code and led the writing of the manuscript. G.H. contributed extensively to testing and improving the package structure. P.L. and H. J. F. contributed to all aspects of microbiology. All authors contributed critically to the drafts and gave final approval for publication.
DATA ACCESSIBILITY
All code and data are included in microPop v1.3 on GitHub with DOI: https://doi.org/10.5281/zenodo.842797 (Kettle, Holtrop, Louis, & Flint, 2017).
REFERENCES




