Estimating interaction strengths for diverse horizontal systems using performance data
Abstract
- Network theory allows us to understand complex systems by evaluating how their constituent elements interact with one another. Such networks are built from matrices which describe the effect of each element on all others. Quantifying the strength of these interactions from empirical data can be difficult, however, because the number of potential interactions increases nonlinearly as more elements are included in the system, and not all interactions may be empirically observable when some elements are rare.
- We present a novel modelling framework which uses measures of species performance in the presence of varying densities of their potential interaction partners to estimate the strength of pairwise interactions in diverse horizontal systems.
- Our method allows us to directly estimate pairwise effects when they are statistically identifiable and to approximate pairwise effects when they would otherwise be statistically unidentifiable. The resulting interaction matrices can include positive and negative effects and the effect of a species on itself, and allow for non-symmetrical interactions.
- We show how to link the parameters inferred by our framework to a population dynamics model to make inferences about the effect of interactions on community dynamics and diversity.
- The advantages of these features are illustrated with a case study on an annual wildflower community of 22 focal and 52 neighbouring species, and a discussion of potential applications of this framework extending well beyond plant community ecology.
1 INTRODUCTION
In many biological systems, interactions between system elements (be these species, individuals, etc.) affect population-level performance and together determine the dynamics of the whole system. To understand system dynamics when multiple system elements are involved, complex systems can be represented as networks where the elements are nodes and linked by interactions (Pimm & Lawton, 1978). These nodes can take on a wide array of identities, including cells, individuals, populations or species. Likewise, interactions or links can operate via many different mechanisms and have a wide range of effects on the nodes. Network theory has been widely applied to investigate different biological systems. It has been particularly effective at informing us of the underlying biological processes structuring diverse multi-trophic communities (Dunne et al., 2002; Thompson et al., 2012) through the study of vertical interactions in food webs, plant-pollinator and host–parasite systems (Cirtwill & Stouffer, 2015; Lafferty et al., 2008; Stouffer et al., 2014).
Horizontal networks, however, where interactions occur within the same level of organisation (for example interactions between plants belonging to the same food web; Vellend, 2016) have been more neglected by network ecology (Ellison, 2019). In such systems, interactions between species are not always easy to directly observe empirically and must instead be deduced through other means. A common approach in population ecology is to directly quantify the effects of interactions on a species of interest by evaluating performance in the absence and presence of potential interaction partners at fixed or varying densities (Connell, 1961; Grace & Tilman, 1990). ‘Performance’ here refers to any variable that affects the dynamics of the system, for example quantity of resources gathered, biomass accumulation, or population growth rate. Measuring species interactions as effects on performance allows us to infer future population trajectories through their connection to population dynamics models, and thus helps draw direct conclusions about the effects of interactions on emergent diversity patterns (Laska & Wootton, 1998). The resulting interactions are phenomenological and thus not dependent on any specific mechanism, allowing us to capture a wide range of biological processes affecting the dynamics of the whole system (Novak & Wootton, 2010). Such methods can quickly become data intensive and computationally complex, however, because the number of potential direct interactions increases as the square of the number of species. Highly diverse systems pose a further challenge: the abundance distribution of different species is typically skewed, with a few species making up the majority of abundances and a large number of elements remaining rare (Fisher et al., 1943). Given that data collection is limited in time and scope, interactions with rarer species may not be observed simply by chance. We thus run the risk of excluding them from analyses regardless of the role they might play (Olesen et al., 2011). Empirically quantifying interaction matrices for diverse horizontal systems thus requires a method that can accommodate both a high number of species and potential gaps in our records of interactions.
Various methods have been developed to circumvent these issues. A common strategy is to reduce the number of parameters to be estimated by assuming most interactions are weak enough to be negligible, and thus priority is given to inference of only the strongest interactions (Weiss-Lehman et al., 2022). Other approaches include averaging interactions across species by aggregating species together in groups, for example based on their origin and life form (Martyn et al., 2021), by their taxonomy and traits (Uriarte et al., 2004), or lumping all heterospecific species together (Chu & Adler, 2015). Here, we present an alternative approach designed to make the most of available data without requiring such strong a priori hypotheses. Specifically, we develop a joint model that allows us to estimate both identifiable and unidentifiable pairwise interactions from measures of performance in the absence and presence of different interaction partners.
We present a general framework to estimate interactions in diverse horizontal systems. We implement the model in STAN (Carpenter et al., 2017), a Bayesian statistical language, and apply it to an ecological case study of an annual wildflower community in Western Australia. Using this dataset, we estimate positive and negative interactions between 22 focal species and 52 neighbouring species and illustrate a range of uses for this approach through ecologically relevant findings. We further describe how to couch the returned interaction estimates into established models of population dynamics, thus allowing inferences to be drawn between the structure and nature of the interaction matrix and patterns of community abundances and biodiversity. This framework presents a new and exciting way to make use of data that would otherwise be too incomplete to uniquely infer all pairwise interactions. We also provide model code in R and STAN which requires little to no modification for immediate application to a wide variety of datasets of species performance.
2 MATERIALS AND METHODS
We developed a joint modelling framework to estimate pairwise interactions which benefits from several distinguishing features including the ability to estimate both identifiable interactions (direct estimates from the observed data) and unidentifiable interactions (when observations are missing or too few). After describing the required data (Section 2.1), we show how one can estimate identifiable interactions with a unique interaction parameter as described in the neighbour-density dependent model (NDDM; Section 2.2). We then define and select which interactions are identifiable and which are not (Section 2.3) based on data availability. Independent from the first model, we also describe a response–impact model (Section 2.4) where species have a singular effect on neighbours and a singular response. This allows us to estimate unidentifiable interactions as the product of element-specific response and impact parameters. Both models contribute to the overall joint model likelihood as detailed in Section 2.5. Together, identifiable and unidentifiable interaction estimates can then populate community interaction matrices that describe the effects of all interaction partners on the performance of all focal species.
2.1 Data requirements
The joint model framework was initially developed for an ecological dataset where interacting elements (species) affect each other's performance (lifetime reproductive success). Though we refer to system elements as species throughout this paper, this framework can be applied to data from any interacting group of elements (e.g. cells, individuals, populations, species) which meet the following criteria. First, observations must include some proxy for performance, such as growth (e.g. biomass), fecundity (e.g. number of eggs laid) or chemical production (e.g. oxygen). Second, these observations must also record the identities and densities of elements which potentially interact with each focal element. Lastly, observations should be replicated for each focal element with the aim of capturing variation in the identities and densities of interaction partners. Though not strictly necessary, it is also beneficial to have observations of focal elements with no interaction partners to better estimate intrinsic performance.
We define $S$ as the number of focal species $i$, $i \in \{1, \dots, S\}$, and $T$ as the number of interacting species $j$ across all focals, $j \in \{1, \dots, T\}$. Typically, not all species in the system will be represented in the set of focals, such that $S \le T$. Measurements of the performance of individual units from each focal element (e.g. seed production of individual plants belonging to a set of focal species) are stored in a vector $p$ of length $N$ and indexed by $n$, with $n \in \{1, \dots, N\}$. The densities of interaction partners are stored in a matrix $X$, of size $N \times T$. When an element $j$ was absent for a given observation $n$, then $X_{n,j} = 0$. Finally, the species identity of each of the focal individuals is stored in the vector $d$, of length $N$, and containing the index of the corresponding species: $d_n \in \{1, \dots, S\}$.
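As an illustration of the data layout described above, the following minimal Python sketch stores a toy dataset and checks that its pieces are mutually consistent. The numbers are made up, and the names `perform`, `focal_id`, `X` and `check_layout` are hypothetical choices for this example rather than names taken from the authors' code.

```python
# Hypothetical toy data: 5 observations, S = 2 focal species, T = 3 interacting species.
S, T = 2, 3

# Performance of each focal individual (e.g. seed counts), one entry per observation.
perform = [12, 7, 30, 4, 9]

# Focal species index of each observation (1-based here, values in 1..S).
focal_id = [1, 1, 2, 2, 2]

# Densities of interaction partners: one row per observation, one column per species.
# A zero means that species was absent from that observation.
X = [
    [0, 0, 0],  # focal individual observed without any interaction partner
    [2, 0, 1],
    [0, 3, 0],
    [5, 1, 0],
    [1, 0, 0],
]


def check_layout(perform, focal_id, X, S, T):
    """Return the number of observations N if dimensions and indices agree."""
    N = len(perform)
    if len(focal_id) != N or len(X) != N:
        raise ValueError("perform, focal_id and X must all have N entries")
    if any(len(row) != T for row in X):
        raise ValueError("every row of X must have T columns")
    if any(not 1 <= i <= S for i in focal_id):
        raise ValueError("focal indices must lie in 1..S")
    if any(x < 0 for row in X for x in row):
        raise ValueError("densities cannot be negative")
    return N


N = check_layout(perform, focal_id, X, S, T)
```

The first row of `X` is an observation without neighbours, which is the kind of record that helps inform intrinsic performance.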
2.2 Neighbour-density dependent model
We quantify the strengths of interactions by regressing the performance of a species (alternatively, a population or any chosen set of replicated units) against the density and identity of other interacting species in a NDDM. Increases or decreases in a species' performance are thus attributed to the changing densities of its interaction partners.
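A regression of this kind can be sketched as follows. The symbols are assumptions made for illustration: $g$ is a link function, $p_n$ the performance recorded in observation $n$, $d_n$ the focal species of that observation, $X_{n,j}$ the density of species $j$, $\gamma$ the intercepts and $\beta$ the interaction parameters.

```latex
g\left(\mathbb{E}[p_n]\right) = \gamma_{d_n} - \sum_{j=1}^{T} \beta_{d_n, j}\, X_{n,j}
```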
The parameters $\beta_{i,j}$ capture the effect of each species $j$ on focal species $i$, whereas the intercept $\gamma_i$ represents intrinsic performance (in the link scale), that is, a species' performance in the absence of interactions or when all interaction effects are $0$. Note that interacting species can include members of focal species $i$ itself, in which case intraspecific interactions are captured by the parameter $\beta_{i,i}$. Furthermore, Equation (1) places no restrictions on the sign of $\beta_{i,j}$: interactions can be harmful to the focal species (competitive) or beneficial (facilitative). The negative sign in front of $\beta_{i,j}$ does, however, mean that positive interaction estimates should be interpreted as competitive and negative estimates as facilitative. Note that because $d_n$ is the focal species index for each observation, parameters making use of the subscript $d_n$ can use an $i$ subscript instead when they are no longer linked to a specific observation of performance.
We implement this model, as well as the RIM described below (in Section 2.4), as generalised linear models. This means that the relationship between $p_n$ and the right-hand side of the equation does not need to be linear. This can easily be changed by choosing an appropriate link function for the data in question. In our case study, we use the log link function to model our response variable as a negative-binomial variate.
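To make the role of the link function concrete, here is a small Python sketch that turns intercepts and interaction parameters into expected performance under a log link. It is an illustration only: the function name, the parameter values and the 0-based indexing are assumptions of this example, and it ignores the negative binomial dispersion used in the case study.

```python
import math


def nddm_expected_performance(gamma, beta, focal_id, X):
    """Expected performance per observation under a log link.

    gamma[i]    : intrinsic performance of focal species i on the link scale
    beta[i][j]  : effect of species j on focal species i (positive = competitive)
    focal_id[n] : focal species index of observation n (0-based in this sketch)
    X[n][j]     : density of species j in observation n
    """
    mu = []
    for i, densities in zip(focal_id, X):
        eta = gamma[i] - sum(b * x for b, x in zip(beta[i], densities))
        mu.append(math.exp(eta))  # inverse of the log link
    return mu


# Hypothetical parameter values for two species.
gamma = [math.log(20.0), math.log(10.0)]
beta = [[0.10, -0.05],  # focal 0 competes with itself and is facilitated by species 1
        [0.00, 0.20]]
focal_id = [0, 0, 0, 1]
X = [[0, 0], [2, 0], [0, 4], [3, 1]]
mu = nddm_expected_performance(gamma, beta, focal_id, X)
```

With these values, conspecific neighbours lower the expected performance of focal species 0 below its intrinsic value, while neighbours of species 1 raise it, matching the sign convention described above.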
2.3 Defining identifiable and unidentifiable interaction parameters
A common issue in observational datasets is that some species or elements are not observed interacting with other species or interacting at sufficiently variable densities because data sampling is limited. This can create a situation in which we cannot estimate all potential interaction parameters ($\beta_{i,j}$) in the NDDM specified above, especially for rare species $j$. For an interaction parameter $\beta_{i,j}$ to be identifiable, and thus inferrable by the NDDM, data must contain measurements of the performance of $i$ when interacting with $j$ at varying densities. Moreover, the vector of densities of $j$ associated with measurements of focal $i$ must be linearly independent of all other vectors of densities of species interacting with $i$. For example, if two species $j$ and $k$ interact with focal $i$ yet have the same density or equally proportional densities at every measurement, neither $\beta_{i,j}$ nor $\beta_{i,k}$ is inferrable by the NDDM. We define identifiable interactions as those which are inferrable following the above assumptions, and unidentifiable interactions as those which are not. Given an empirical dataset, we construct a matrix $Q$ of size $S \times T$, with $Q_{i,j} = 1$ if the corresponding parameter $\beta_{i,j}$ is identifiable, and $Q_{i,j} = 0$ if not. We describe our verification of linear independence between vectors of neighbour densities and the construction of the matrix $Q$ in the Supporting Information S1.2 and in the GitHub repository.
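One way to sketch such an identifiability check is with a rank comparison. The Python example below is an illustration under stated assumptions and is not the procedure from Supporting Information S1.2: it treats an interaction as identifiable when removing that species' density column lowers the rank of the focal species' design matrix. Including an intercept column, so that a neighbour with constant density is not counted, is a choice of this sketch, as are the function names.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix (list of rows) by exact Gaussian elimination."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0
    ncols = len(m[0]) if m else 0
    for c in range(ncols):
        pivot = next((k for k in range(r, len(m)) if m[k][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for k in range(len(m)):
            if k != r and m[k][c] != 0:
                f = m[k][c] / m[r][c]
                m[k] = [a - f * b for a, b in zip(m[k], m[r])]
        r += 1
        if r == len(m):
            break
    return r


def identifiability_matrix(focal_id, X, S):
    """Q[i][j] = 1 if the densities of species j around focal i are not a linear
    combination of an intercept and the other species' densities, else 0."""
    T = len(X[0])
    Q = [[0] * T for _ in range(S)]
    for i in range(S):
        # Design matrix for focal i: intercept column followed by neighbour densities.
        sub = [[1] + list(X[n]) for n in range(len(X)) if focal_id[n] == i]
        if not sub:
            continue
        full = rank(sub)
        for j in range(T):
            without_j = [row[:j + 1] + row[j + 2:] for row in sub]
            Q[i][j] = 1 if rank(without_j) < full else 0
    return Q


# Toy data (0-based focal indices). Around focal 0, species 1 is always twice as
# dense as species 0, and species 3 is never observed.
focal_id = [0, 0, 0, 0, 1, 1, 1]
X = [
    [1, 2, 0, 0],
    [2, 4, 1, 0],
    [3, 6, 0, 0],
    [0, 0, 2, 0],
    [1, 0, 0, 0],
    [2, 1, 0, 0],
    [4, 1, 0, 0],
]
Q = identifiability_matrix(focal_id, X, S=2)
```

In this toy dataset the two proportional neighbours of focal 0 are both flagged as unidentifiable, as is the species that was never observed, whereas the two species that vary independently around focal 1 are identifiable.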
2.4 Unidentifiable parameters and the response–impact model
To fit the RIM, the densities of species $j$ must be linearly independent of the densities of other species and/or combinations of species, across the entire density matrix $X$. This is in contrast to Section 2.2 above, where the densities of $j$ must be linearly independent within the subset of observations for each focal $i$. Importantly, this is a less strict condition for parameter identifiability. As a result, it is frequently possible to estimate pairwise interactions that would be unidentifiable given Equation (1) by recognising that pairwise density dependence in Equation (2) is given by $r_i e_j$, the product of the response of focal species $i$ and the impact of species $j$.
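The response-impact structure can be sketched as below. The notation is assumed for illustration: $r_i$ is the response of focal species $i$, $e_j$ the impact of species $j$, and $g$, $p_n$, $d_n$, $X$ and $\gamma$ play the same roles as in the NDDM. Whether the intercept is shared between the two models is also an assumption of this sketch.

```latex
g\left(\mathbb{E}[p_n]\right) = \gamma_{d_n} - \sum_{j=1}^{T} r_{d_n}\, e_j\, X_{n,j},
\qquad \beta_{i,j} \approx r_i\, e_j
```

This form needs only $S + T$ response and impact parameters rather than $S \times T$ pairwise ones, which is why it can be fitted when many pairwise parameters cannot.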
Readers should note that the response–impact model makes different assumptions about the structure of interactions in nature. Specifically it supposes that species tend to have a generalisable effect on and response to neighbours regardless of neighbour identity. This stands in contrast to the NDDM, which assumes a unique parameter for every interaction thereby allowing for more idiosyncrasies across species. Such simplification is useful from a statistical point of view in order to infer unidentifiable interactions, but its ecological significance should not be dismissed. Though there is evidence in support of this approach (Skwara et al., 2022; Stouffer et al., 2022), care should always be taken to assess the different assumptions behind the estimation of identifiable and unidentifiable interactions and how well these may apply to any particular system of interest.
2.5 Joining the two models
For a hypothetical dataset in which all interactions are identifiable, every entry of the identifiability matrix $Q$ equals $1$ and fitting the RIM is not strictly necessary. The parameter vectors $r$ and $e$ would still be estimated, but they are independent of the parameters in $\beta$ and hence maximisation of both likelihoods is independent. Conversely, if no interactions are identifiable then the joint model devolves to the RIM only. In the middle ground, where the majority of datasets are likely to lie, the joint model framework allows us to estimate free interaction parameters $\beta_{i,j}$ when possible, and use $r_i e_j$ estimates when not. An important distinction to make is that the NDDM estimates identifiable interactions only, whereas the RIM estimates all interactions, pairwise identifiable and pairwise unidentifiable. However, by maximising Equation (4) we allow both models to provide good fits to the data but also for $r_i$ and $e_j$ to “adjust” around inferred values of $\beta_{i,j}$.
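How the two sets of estimates can populate one matrix is sketched below in Python. This illustrates only the bookkeeping step, not the joint likelihood itself; the function name and the numerical values are hypothetical.

```python
def joint_interaction_matrix(beta_nddm, r, e, Q):
    """Use the NDDM estimate where Q[i][j] == 1, and the product r[i] * e[j] otherwise."""
    S, T = len(Q), len(Q[0])
    return [
        [beta_nddm[i][j] if Q[i][j] == 1 else r[i] * e[j] for j in range(T)]
        for i in range(S)
    ]


# Hypothetical estimates for 2 focal species and 3 interacting species.
Q = [[1, 0, 1],
     [0, 1, 0]]
beta_nddm = [[0.3, None, -0.1],  # None marks parameters the NDDM cannot estimate
             [None, 0.2, None]]
r = [0.5, 2.0]        # response of each focal species
e = [0.4, 0.1, -0.2]  # impact of each interacting species
B = joint_interaction_matrix(beta_nddm, r, e, Q)
```

Every cell of `B` is filled: the identifiable cells carry their own free estimate, while the remaining cells are approximated from the response and impact parameters.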
2.6 Making interaction parameters comparable across focal species
Though this scaling step is not strictly necessary, for ecological datasets these scaled interaction strengths have the benefit of being directly comparable both across species and across environmental contexts where reproductive fitness is likely to vary (Wootton & Emmerson, 2005).
2.7 Integrating interaction strengths into models of population dynamics
In certain instances, models of species population dynamics can be used to extend the usefulness of the framework presented here. We suggest two cases where this may be useful. Firstly, the variable chosen to measure the performance of focal species may not directly translate into a measure of performance which is relevant to system dynamics, due to practical constraints with collecting empirical data. For example, the life-history reproductive strategies of certain species may lead to measures of high performance in the field which do not account for low survival rates post-observation (Broekman et al., 2020). In these cases, population dynamics models can be used to include additional species-specific demographic rates into estimates of interaction effects. Alternatively, we might be more interested in the effects of interacting species on the density or growth rate of a focal species rather than on its performance. In this scenario, a population dynamic model can be used to translate interaction effects on performance into interaction strengths affecting the variable of interest.
In both cases, an established population dynamic model is required as well as knowledge of crucial species-specific demographic rates. This step is illustrated for our case study in the Supporting Information S1.5, where we use an annual plant population dynamic model to transform effects on wildflower seed production (proxy for performance) into effects on population growth, including species-specific estimates of seed germination and survival rates.
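As a rough illustration of how demographic rates enter such a model, the Python sketch below advances one generation of a commonly used annual plant model with a seed bank. It is an assumption-laden example and not the derivation in Supporting Information S1.5: the exponential form of seed production is chosen here only to mirror a log link, and all names and values are hypothetical.

```python
import math


def annual_plant_step(N, g, s, gamma, beta):
    """One generation of an annual plant model with a seed bank.

    N[i]       : seeds of species i in the seed bank
    g[i]       : germination rate of species i
    s[i]       : survival rate of ungerminated seeds of species i
    gamma[i]   : intrinsic seed production of species i on the log scale
    beta[i][j] : effect of one germinated neighbour j on seed production of i
    """
    germinated = [g_j * n_j for g_j, n_j in zip(g, N)]
    N_next = []
    for i in range(len(N)):
        seeds_per_plant = math.exp(
            gamma[i] - sum(b * x for b, x in zip(beta[i], germinated))
        )
        dormant = (1 - g[i]) * s[i] * N[i]  # seeds that stay in the seed bank
        produced = g[i] * N[i] * seeds_per_plant  # seeds made by germinated plants
        N_next.append(dormant + produced)
    return N_next


# Hypothetical two-species example.
N1 = annual_plant_step(
    N=[100.0, 50.0],
    g=[0.8, 0.3],
    s=[0.5, 0.9],
    gamma=[math.log(5.0), math.log(8.0)],
    beta=[[0.01, 0.002], [-0.001, 0.02]],
)
```

Because neighbour density enters through germinated individuals, a species with a high germination rate exerts its per-capita effect through more individuals, which is one reason demographic rates matter when translating effects on performance into effects on population growth.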
2.8 Model fitting
We implement the NDDM (Equation 1), the RIM (Equation 2), and the joint model as generalised linear models in STAN (Carpenter et al., 2017), using R version 3.6.3 (R Development Core Team, 2020) and the rstan package (Stan Development Team, 2020). Using STAN requires translating the model formula into the STAN language, setting priors for parameters to be estimated, and using an indexing system to discriminate between identifiable and unidentifiable interactions. We provide a working example of the STAN code as well as R scripts and functions to run and fit the model on a simulated dataset in our GitHub repository https://github.com/malbion/JointModelFramework; see Supporting Information S1.1 for additional comments on the code. From the model file, only the family function, the link function for the performance variable and its parameterisation need to be modified in order to apply it to a differently-distributed dataset. Additionally, non-integer measures of performance (e.g. biomass) should be redefined as real rather than integers in the data block. In the code given, a negative binomial distribution is used to fit seed production, but a different distribution may be more appropriate when using other measures of performance.
STAN returns parameters as distributions which maximise the likelihood, and are conditioned by the data and priors. Priors describe the distribution of plausible values which parameters may take. For an introduction to Bayesian inference which relates the use of priors to frequentist hypothesis testing, see Ellison (1996). We recommend investigators experiment with setting different informed priors to both improve model convergence and verify the robustness of parameter estimates. The resulting parameter distributions are termed posterior distributions, from which samples are drawn for analysis. Using parameter distributions rather than point estimates allows for easy inclusion of uncertainty; we therefore recommend sampling from each posterior interaction strength distribution to create multiple samples of the community interaction matrix. See Ellison (2004) for a comprehensive review of parameter estimations.
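The recommendation to build multiple samples of the interaction matrix can be sketched in Python as follows. This is a minimal illustration assuming posterior draws have already been extracted into nested lists; the function name and the toy draws are hypothetical.

```python
import random


def sample_interaction_matrices(draws, n_samples, seed=0):
    """Build n_samples interaction matrices from posterior draws.

    draws[i][j] is a list of posterior draws for the effect of species j on
    focal species i. Each sampled matrix picks one draw per interaction.
    """
    rng = random.Random(seed)
    S, T = len(draws), len(draws[0])
    return [
        [[rng.choice(draws[i][j]) for j in range(T)] for i in range(S)]
        for _ in range(n_samples)
    ]


# Toy posterior draws for a 2 x 2 matrix.
draws = [
    [[0.10, 0.20, 0.30], [-0.50, -0.40]],
    [[0.00], [1.00, 1.10]],
]
samples = sample_interaction_matrices(draws, n_samples=5)
```

Sampling each interaction separately, as here, treats the interaction strengths as independent. Drawing whole MCMC iterations instead (one iteration index shared by all parameters) would preserve correlations among parameters, and may be preferable when those correlations matter for downstream analyses.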
2.9 Assessing model convergence
We first evaluated convergence of the NDD-only model, the RI-only model, and the joint NDD-RI model when fit to simulated data created with the simul_data() function available in this project's associated GitHub repository. In these test fits, we ran the models with 4 chains and 3000 iterations, of which the first 2000 were discarded. We varied the total number of species and the proportion of unidentifiable interactions across our simulated datasets; when fitting the NDDM-only model, unidentifiable interactions were assigned a value of $0$. We observed good convergence of the $\gamma_i$ and $\beta_{i,j}$ parameters in all models as evaluated by the $\hat{R}$ statistic and visual inspection of traceplots (results not shown here). As expected for the RI-only model and the joint model, our latent variables $r_i$ and $e_j$ often showed sign switching. This means that different MCMC chains returned coefficient values which are of the same magnitude but with opposing signs. Whilst this affects the $\hat{R}$ statistic of these $r_i$ and $e_j$ parameters, it does not affect the convergence of the resulting interaction parameters (i.e. $r_i e_j$), which is why we excluded $r_i$ and $e_j$ from our evaluation of convergence.
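The effect of sign switching on convergence diagnostics can be illustrated with simulated chains. The Python sketch below uses a basic, non-split Gelman-Rubin statistic, which is simpler than the rank-normalised split version reported by recent versions of Stan, and the chains are synthetic draws rather than output from the models described here.

```python
import random
import statistics


def rhat(chains):
    """Basic (non-split) Gelman-Rubin statistic for equal-length chains."""
    n = len(chains[0])
    within = statistics.mean([statistics.variance(c) for c in chains])
    between = n * statistics.variance([statistics.mean(c) for c in chains])
    return (((n - 1) / n * within + between / n) / within) ** 0.5


rng = random.Random(1)
signs = [1, 1, -1, -1]  # two of the four chains settle on the mirrored solution
r_chains, e_chains, product_chains = [], [], []
for sgn in signs:
    r = [sgn * rng.gauss(0.8, 0.05) for _ in range(500)]
    e = [sgn * rng.gauss(0.5, 0.05) for _ in range(500)]
    r_chains.append(r)
    e_chains.append(e)
    product_chains.append([a * b for a, b in zip(r, e)])

rhat_r = rhat(r_chains)
rhat_e = rhat(e_chains)
rhat_product = rhat(product_chains)
```

Because the sign flips in the response and impact cancel in their product, the chains disagree strongly on each latent variable while agreeing on the interaction itself, so only the diagnostic for the product stays close to 1.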
2.10 Dealing with sparse networks
In horizontal systems, interaction networks are expected to be non-sparse because species typically interact via a small set of shared, limiting resources. Nonetheless, it is likely that the posterior distributions of some interaction estimates returned by the model will overlap with zero. An overlap with zero may be due to several factors and does not necessarily equate to an interaction being insignificant. Firstly, an overlap with zero may arise when an interaction is positive or negative depending on local conditions. Secondly, it may indicate the interaction is poorly informed and hence overlaps with zero because it has a wide posterior distribution, reflecting a lack of confidence in its effect. Lastly, the interaction may be well-informed but weak, in which case the posterior is centred around zero.
If, contrary to our assumptions, a user of our framework has good cause to believe that many interactions are weak and the network is instead sparse, there are multiple approaches for dealing with this issue. For example, setting strongly informed priors centred around zero for specific interactions that are thought to be negligible would be relatively easy to implement in the code we provide. More general alternative methods which have been developed explicitly for sparse interaction networks already exist, albeit where sparsity is applied across all interactions (Weiss-Lehman et al., 2022). Forbidden links are a subset of potential interactions which cannot be observed, often due to physical constraints (e.g. biological mismatch) or spatio-temporal uncoupling. For example, a pair of short-lived annual plants might have non-overlapping growing seasons and thus never overlap in the field. We direct the reader towards the literature on forbidden interactions (Jordano, 2016; Olesen et al., 2011) for solving these cases.
2.11 Case study
We applied this framework to an annual wildflower community dataset from Western Australia (Bimler et al., 2023), collected in 2016 (permit number SW017856 for “Flora take for a Scientific or Other Prescribed Purpose Licence” issued by the Western Australia Department of Parks and Wildlife). This dataset contains over 5000 observations of individual plant seed production from 22 different focal species, and the identity and density of all neighbouring individuals within a 3 to 5 cm radius of the focal individual. The environmental heterogeneity of this system is well studied (Dwyer et al., 2015) and we thus know that at the local scales across which this system was surveyed, only soil phosphorus, shade and the presence of woody debris impact on diversity and abundance patterns. To account for this known environmental heterogeneity, we randomly thinned plots to decouple abundance and environment effects (Supporting Information S1.4.1). We used our framework to quantify interactions between these 22 focal species and 52 neighbour species, and derived scaled interaction strengths with a well-supported population dynamics model for annual plants with a seed bank (Bimler et al., 2018; Levine & HilleRisLambers, 2009) which required experimentally-measured species demographic rates (Supporting Information S1.4.2). Viable seed production was used as the measure of performance and modelled with a negative binomial distribution and a log link. Further details on the procedure for deriving scaled interactions from the population dynamics model are available in the Supporting Information S1.5.
We fit all three models to the case study data with 4 chains and 7000 iterations, of which the first 5000 were discarded. All parameters for the NDDM-only model converged well; however, some parameters of the RIM-only and joint models received high $\hat{R}$ values. High $\hat{R}$ values without other warnings are commonly associated with posteriors that are highly correlated and whose geometry is hence difficult to traverse (Stan Development Team, 2022). The impact of such “problematic geometries”, however, is dependent on the data at hand, as evidenced by reliable convergence of all models on the simulated data. We explain why this may be the case in our case study, and how this affects the $\hat{R}$ of the parameters concerned, in the Supporting Information S1.3. By comparing the fits across models and the resulting posterior distributions of model parameters, we remained comfortable using the joint model estimates returned by multiple chains to further study the interactions underlying the case study data. Supporting Information S1.3 goes into further detail in this regard and, in particular, shows how the posterior intervals of the model parameters remain informative despite high $\hat{R}$ values. Model parameters were sampled 1000 times from the 80% posterior confidence intervals to construct our parameter estimates. We then applied bootstrap sampling from each resulting interaction strength distribution to create 1000 samples of the community interaction network.
3 RESULTS
The joint model framework returns an interaction matrix whose elements quantify the effects of interacting species (columns) on the performance of focal species (rows). The interactions (values) which make up this matrix can be positive or negative, non-symmetrical (the effect of element $i$ on $j$ does not necessarily match the effect of element $j$ on $i$) and include intraspecific effects (the effect of element $i$ on itself). We illustrate the advantages of this approach in the case study results below.
3.1 Case study results
The model returned estimates for all 1144 interactions between 22 focal species and 52 interacting species, of which 56.7% were identifiable and estimated by the NDDM. When accounting for interactions between focal species only, 82.0% of interactions were identifiable. We conducted a posterior predictive check comparing simulated performance data from the joint model to observed values (Figure 1). This is especially important for verifying that the appropriate distribution and link function are being used for the data at hand. The joint model also returns simulated performance data for the RIM only, which we also checked visually (Figure S3). Model parameters were sampled 1000 times from the 80% posterior confidence intervals returned by STAN to construct our parameter estimates. Interaction estimates between focal species were scaled according to the annual plant population model (Supporting Information S1.5) into interaction strengths affecting population growth, and we sampled from each resulting interaction strength distribution to create 1000 samples of the scaled community interaction matrix.

For our case study, the resulting scaled community interaction matrix was non-symmetrical and included both positive (competitive) and negative (facilitative) values, as shown in Figure 2a,b. Here, we represent the community matrix as a network between all 22 focal species, taking the median value of each scaled interaction across all samples. Overall, the median strength of 55.8% of focal-focal interactions was competitive, making competition the dominant interaction type. The median of 44.2% of focal-focal interactions, however, was facilitative. As a result, the medians of 47.6% of interactions between pairs of focal species were of opposing signs, such that $i$ competes with $j$ but $j$ facilitates $i$. Furthermore, the elements of the diagonal (the effect of a species on itself) could be estimated, which allows us to quantify how much a species regulates its own performance. Median intraspecific interaction strength was competitive for 13 focal species (59.1%) and facilitative for the remaining 9 (40.9%). For 11 of our 22 focal species, the scaled distributions of intraspecific effects did not overlap with $0$, suggesting that individuals of those species have a non-trivial effect on other individuals of the same species. When considering interspecific interactions with neighbouring focal species, the proportion which did not overlap with $0$ dropped to 25.5%.
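Summary proportions of this kind can be computed from a matrix of median interaction strengths along the following lines. The Python sketch uses a made-up 3 x 3 matrix; the function name is hypothetical, and counting the diagonal among all interactions is an assumption of this example.

```python
def summarise_signs(A):
    """Summarise a square matrix of median interaction strengths among focal
    species (rows = focal, columns = neighbour, positive = competitive)."""
    S = len(A)
    cells = [A[i][j] for i in range(S) for j in range(S)]
    prop_competitive = sum(v > 0 for v in cells) / len(cells)
    prop_facilitative = sum(v < 0 for v in cells) / len(cells)
    # Unordered pairs of distinct species whose reciprocal effects differ in sign.
    pairs = [(i, j) for i in range(S) for j in range(i + 1, S)]
    prop_opposing = sum(A[i][j] * A[j][i] < 0 for i, j in pairs) / len(pairs)
    # Species whose median effect on themselves is competitive.
    prop_self_competitive = sum(A[i][i] > 0 for i in range(S)) / S
    return prop_competitive, prop_facilitative, prop_opposing, prop_self_competitive


# Made-up medians for three focal species.
A = [
    [0.5, -0.2, 0.1],
    [0.3, 0.4, -0.6],
    [0.2, 0.7, -0.1],
]
summary = summarise_signs(A)
```

In this toy matrix six of the nine interactions are competitive, and two of the three species pairs have reciprocal effects of opposing sign.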

3.2 Examples of ecological applications
We illustrate a few potential applications of our framework by exploring questions of common ecological relevance with our case study. Each question below highlights some of the advantages of our resulting interaction network: intraspecific interactions, non-symmetrical interactions and the ability to estimate positive and negative interactions.
3.2.1 Q1: Do abundant natives under-regulate their population density compared to rarer native species?
One hypothesis as to why certain plant species are more abundant than others is that they tend to compete with themselves less strongly than rare species (Yenni et al., 2012, 2017). Hypothetically, this release from intraspecific competition pressure allows them to reach much higher densities than species which strongly compete with themselves. In our case study, we explore this hypothesis by plotting the effect of a species on itself against its density as in Figure 3a. Intraspecific interactions are at their weakest when close to $0$. The two most abundant native species, Velleia rosea (VERO) and Podolepis canescens (POCA), highlighted in purple, fall very close to the median intraspecific interaction strength. This suggests that V. rosea and P. canescens do not reach high densities through an under-regulation of their population density and are thus likely to be reaching these densities through other means such as access to a larger niche space.

3.2.2 Q2: Which species are keystones in this system?
Keystone species have disproportionately strong effects on other species in their system and the dynamics of the whole ecosystem, relative to their abundances (Libralato et al., 2006; Piraino et al., 2002; Power et al., 1996). As such, their exclusion from a community is expected to create significant changes in species density and composition (Paine, 1969). Though determining which species truly serve keystone roles has historically involved extensive ecological experimentation (e.g. Paine, 1992) and the inclusion of multiple trophic levels, we can identify potential candidates by comparing a species' impact on the population growth of other species to its own density (Libralato et al., 2006). It is important to note that because our framework allows for asymmetrical interactions, we are able to differentiate a species' impact on other species from its response or sensitivity to neighbours (Broekman et al., 2020). Figure 3b highlights a native species in green, Gilberta tenuifolia (GITE), which may be a potential keystone species because it has strongly competitive effects on the rest of the community overall, despite its low density.
3.2.3 Q3: Do all exotic species compete with native species?
Though invasive species are expected to compete with natives (Corbin & D'Antonio, 2004; Naeem et al., 2000; Riley et al., 2008; Zheng et al., 2015), several studies have found evidence of exotic species facilitating natives, with cascading effects on other species and net positive effects on ecosystem processes (Ramus et al., 2017; Rodriguez, 2006; Wainwright et al., 2019). By allowing for positive and negative interaction strengths between species in a system, we can determine which exotics are harmful or beneficial to the native species in a community. Figure 3c plots the sum of a species' competitive effects on neighbours against the sum of its facilitative effects on neighbours. Exotic species are identified in red. Out of these three exotic species, two have overall competitive effects on the community: Arctotheca calendula (ARCA) and Pentameris airoides (PEAI). Both species have weak or close-to-median effects on their neighbours compared to other focal species. Hypochaeris glabra (HYPO), on the other hand, has strong effects on other species, both competitive and facilitative, and has an overall facilitative contribution to the community. These effects are partly driven by its very high germination rate (over twice as high as any other focal species). These results suggest that here at least, the effects of exotic species on native species are complex and species-dependent.
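Separating the competitive and facilitative effects a species has on its neighbours amounts to summing the positive and negative entries of its column in the interaction matrix. The Python sketch below shows this with a made-up matrix; the function name is hypothetical, and excluding the diagonal (a species' effect on itself) is an assumption of this example rather than a documented detail of Figure 3c.

```python
def effect_sums(A):
    """For each species j, return (sum of competitive effects, sum of facilitative
    effects) on the other species, read down column j of A.
    Rows are focal species, columns are neighbours, positive means competitive."""
    S = len(A)
    sums = []
    for j in range(S):
        column = [A[i][j] for i in range(S) if i != j]
        competitive = sum(v for v in column if v > 0)
        facilitative = sum(v for v in column if v < 0)
        sums.append((competitive, facilitative))
    return sums


# Made-up medians for three species.
A = [
    [0.5, -0.2, 0.1],
    [0.3, 0.4, -0.6],
    [0.2, 0.7, -0.1],
]
sums = effect_sums(A)
net = [c + f for c, f in sums]  # positive = net competitor, negative = net facilitator
```

Here the third species would be classed as a net facilitator, since its facilitative effect on one neighbour outweighs its weak competitive effect on the other.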
4 DISCUSSION
Our novel framework quantifies the effects of interacting species on reciprocal performance, allowing the estimation of diverse, horizontal interaction matrices. The resulting matrices are non-symmetrical and can contain both positive and negative interactions, as well as the effect of a species on itself. This framework is flexible with respect to the metric of performance, the type of group (e.g. species, population, etc.) and the level of diversity. We also propose a way to approximate unidentifiable interactions given information about those which are identifiable, which is a significant feature given that networks for horizontal systems are expected to be non-sparse. The interaction matrices generated through this framework can be transformed into interaction networks through the use of models describing the system's interaction dynamics. Together, these features differentiate our framework from other methods currently available for estimating interactions in diverse horizontal systems and make it particularly useful in an ecological context (as illustrated in our case study) as well as flexible for use with data from the wide range of complex systems dominated by horizontal interactions.
In particular, our approach includes a method to infer all pairwise interactions despite ‘incomplete’ data. There are, generally speaking, two alternative strategies to deal with this issue, both of which attempt to reduce the number of interactions to be estimated. The first assumes that many species have similar effects on one another and can be grouped a priori according to biological factors (e.g. traits, life form; Martyn, 2020; Uriarte et al., 2004). The second assumes that a majority of interactions are weak and hence can be removed from the model through variable selection procedures (Mutshinda et al., 2009; Weiss-Lehman et al., 2022), resulting in a sparse interaction network. Weiss-Lehman et al. (2022) defined interactions as a combination of an average community-level measure and species-specific deviations from this average, and they used regularisation approaches to allow only a subset of these deviations to take non-zero values. In our case study, many interaction estimates overlapped with zero, though as discussed in the Methods, Section 2.10 this does not necessarily imply sparsity.
Ovaskainen et al. (2017) present a similar approach to ours but for time-series data. Their method assumes that interspecific interactions can be described by a small number of community-level drivers (effectively linear combinations of species abundances) which best predict future growth rates. Though this method requires all intraspecific interactions to be directly inferrable, the number of community-level drivers is much smaller than the number of species, allowing interspecific interactions between many species to be quantified despite relatively short time-series data. They compared their framework to both sparse and full interaction models, the latter performing poorly due to overfitting. Indeed, whether our framework provides a better fit to data remains to be tested, and will depend strongly on the particulars of both the data and system at hand. Nonetheless, rather than treat sparse and full interaction models as an either-or question, our joint method provides an ‘intermediate’ way to make use of data that has historically been insufficient or too incomplete to infer all pairwise interactions. We thus expect it will open up a wide range of questions that were previously difficult or impossible to answer. Though we illustrate our study with one particular ecological dataset, the method presented here could be adapted for use on a wider variety of horizontal systems such as those found in microbial, neural, and social networks.
Species interaction networks have a wide range of practical applications, such as evaluating ecosystem response to human-altered landscapes, guiding future management decisions (Cross et al., 2011) or exploring how communities may respond to global warming (Gorman et al., 2019). Conservation and ecosystem management efforts aimed at regulating species abundances can, for example, use the information provided by an interaction network to prioritise which species to conserve or eradicate based on their role in the community (Cirtwill et al., 2018). Identifying keystone, foundation and other important types of species roles is also helpful for understanding biological diversity, ecosystem integrity and functioning, especially in response to disturbances and other stresses (Losapio & Schöb, 2017; Narwani et al., 2019; Nyakatya & McGeoch, 2008; Orwin et al., 2016), though this often requires the inclusion of other trophic levels. The examples we describe here are not exhaustive but serve to illustrate how horizontal interaction networks, especially when linked to population models, can help us understand both community dynamics overall and the effects and responses of specific species in relation to the community.
Quantifying the community interaction matrix can also allow us to explore how the mechanisms maintaining diversity and stability operate in these systems and across a broad number of species. Self-regulation, for example, is an extremely important driver of community stability (Barabás et al., 2017) and arises from how individuals of the same species interact with one another. Measures of intra- and interspecific interactions can also allow us to estimate niche overlap between species (for an example, see Chu & Adler, 2015); weak interactions between species suggest that they are not sharing or competing for many resources, and thus may have large niche differences in the community. Moreover, our inclusion of facilitative interactions, which have traditionally been disregarded in plant population models and mathematical frameworks of plant coexistence, provides a means to investigate the prevalence and strength of facilitation across multiple species and how it may act in relation to competition and species diversity. Recent work suggests facilitation may be more widespread than traditionally thought (Gross et al., 2015; Picoche & Barraquand, 2020) and is likely to benefit species diversity and stability in some systems (Brooker et al., 2008; Coyte et al., 2015).
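To make the niche overlap idea concrete, the sketch below applies one common pairwise definition from coexistence theory, in which overlap is the geometric mean of the two interspecific coefficients scaled by the two intraspecific ones. It is offered as an assumption-laden illustration rather than the calculation used in any cited study: the function name `niche_overlap` is hypothetical, and this form of the measure is only defined when all four coefficients are competitive (positive, under the sign convention assumed here), so pairs involving facilitation return NaN.

```python
import math

def niche_overlap(a_ii, a_ij, a_ji, a_jj):
    """Pairwise niche overlap: sqrt((a_ij * a_ji) / (a_ii * a_jj)).

    Assumptions (illustrative only): a_ij is the per-capita effect of
    species j on species i, with positive values meaning competition.
    Returns NaN when any coefficient is zero or facilitative, because
    the square-root form is not defined in that case.
    """
    if min(a_ii, a_ij, a_ji, a_jj) <= 0:
        return float("nan")
    return math.sqrt((a_ij * a_ji) / (a_ii * a_jj))

# Weak interspecific relative to intraspecific effects: low overlap,
# i.e. a large niche difference (1 - overlap).
overlap = niche_overlap(a_ii=1.0, a_ij=0.25, a_ji=0.25, a_jj=1.0)
print(overlap, 1 - overlap)
```

Because this form breaks down for facilitative pairs, it would need to be replaced or extended before being applied to a full matrix containing both positive and negative interactions.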
Ultimately, quantifying species interaction networks allows us to apply tools from network theory to help us understand how these interactions drive community-level patterns of density and diversity. Several metrics already exist for describing horizontal network structure, such as weighted connectance (Ulanowicz & Wolff, 1991) or relative intransitivity (Laird & Schamp, 2006), though these are fewer than for trophic or unweighted networks (e.g. Bersier et al., 2002; Delmas et al., 2019). Adapting measures of nestedness or modularity, for example, to non-sparse networks (as horizontal communities typically are) would allow us to further characterise how interactions and species are organised. These metrics relate to various aspects of stability and could greatly inform us on how diversity is maintained. Likewise, networks also provide several ways of measuring and describing species roles in their communities (Cirtwill et al., 2018), for example through the use of structural motifs: unique patterns of interacting species which together make up the whole network. Motifs have been found to have important biological meaning in food webs (Bascompte & Melian, 2005) but remain to be identified for horizontal networks.
AUTHOR CONTRIBUTIONS
Malyon D. Bimler designed the methodology, carried out analyses and led the drafting of the manuscript. Margaret M. Mayfield helped design the field study, collected data, contributed to the interpretation of analyses and critically revised the manuscript. Trace E. Martyn led the field study and data collection. Daniel B. Stouffer helped design the methodology, interpret analyses and critically revised the manuscript.
ACKNOWLEDGEMENTS
We thank M. Raymundo and I. Towers for the seed rate data used in this paper, C. Bowler, T. Britton and J. Ikin for their work in sample processing for this data set, as well as Rob Freckleton, Jacopo Grilli, Stefano Allesina and two anonymous reviewers for insightful comments on the manuscript. We also wish to acknowledge the Traditional Custodians of the Country from which the case study data was collected, the Yamatji People, as well as those of the land on which this research was conceived of, carried out, and written, the Jagera and Turrbal Peoples. This work was made possible by funding awarded to M.M.M. (DP170100837) by the Australian Research Council. D.B.S. is grateful for the support of the Marsden Fund Council, from New Zealand Government funding (grant 16-UOC-008). Open access publishing facilitated by The University of Queensland, as part of the Wiley - The University of Queensland agreement via the Council of Australian University Librarians.
CONFLICT OF INTEREST STATEMENT
The authors declare no competing interests.
Open Research
PEER REVIEW
The peer review history for this article is available at https://publons.com/publon/10.1111/2041-210X.14068.
DATA AVAILABILITY STATEMENT
Code is available in the GitHub repository https://github.com/malbion/JointModelFramework. Data presented in the case study is available from the Dryad Digital Repository: https://doi.org/10.5061/dryad.h44j0zpq3.