Exploring population responses to environmental change when there is never enough data: a factor analytic approach

Temporal variability in the environment drives variation in vital rates, with consequences for population dynamics and life‐history evolution. Integral projection models (IPMs) are data‐driven structured population models widely used to study population dynamics and life‐history evolution in temporally variable environments. However, many datasets have insufficient temporal replication for the environmental drivers of vital rates to be identified with confidence, limiting their use for evaluating population level responses to environmental change. Parameter selection, where the kernel is constructed at each time step by randomly selecting the time‐varying parameters from their joint probability distribution, is one approach to including stochasticity in IPMs. We consider a factor analytic (FA) approach for modelling the covariance matrix of time‐varying parameters, whereby latent variable(s) describe the covariance among vital rate parameters. This decreases the number of parameters to estimate and, where the covariance is positive, the latent variable can be interpreted as a measure of environmental quality. We demonstrate this using simulation studies and two case studies. The simulation studies suggest the FA approach provides similarly accurate estimates of stochastic population growth rate to estimating an unstructured covariance matrix. We demonstrate how the latent parameter can be perturbed to show how selection on reproductive delays in the monocarp Carduus nutans changes under different environmental conditions. We develop a demographic model of the fire dependent herb Eryngium cuneifolium to show how a putative driver of the variation in environmental quality can be incorporated with the addition of a single parameter. Using perturbation analyses we determine optimal management strategies for this species. This approach estimates fewer parameters than previous approaches and allows novel eco‐evolutionary insights. Predictions on population dynamics and life‐history evolution under different environmental conditions can be made without necessarily identifying causal factors. Putative environmental drivers can be incorporated with relatively few parameters, allowing for predictions on how populations will respond to changes in the environment.

Interest in understanding the ecological consequences of environmental variation has increased rapidly as a consequence of global climate change (Evans, 2012;Stenseth et al., 2002). As experimental approaches to determining how natural populations are affected by environmental variation are frequently impractical, structured demographic models are often used to understand the population level effects of environmental change (Coulson, 2012). Environmental effects on vital rates can be complex, with nonlinear effects, multiple interacting drivers, indirect effects, and correlations between the drivers (Darling & Cote, 2008;Ehrlen, Morris, von Euler, & Dahlgren, 2016;Parmesan et al., 2013).
These challenges, and the relatively short length of many demographic datasets (Salguero-Gomez et al., 2015, mean it is often difficult to identify explicit environmental drivers of vital rates. This restricts the ability of models to predict how populations will respond to environmental change (Crone et al., 2013).
Stochastic demographic models, such as matrix population models (MPMs; see Caswell, 2001) and integral projection models (IPMs; see Ellner, Childs, & Rees, 2016), are widely used to study population dynamics in temporally variable environments (e.g., Inchausti & Weimerskirch, 2001;Vindenes et al., 2014). In an IPM the annual transitions are given by kernels, typically parameterised by estimating state-fate relationships. Stochastic models allow the state-fate relationships to vary temporally (or spatially), using either parameter or kernel selection (Metcalf et al., 2015). Under a kernel selection approach, a projection kernel is estimated for each year and these are resampled (Rees et al., 2006;Williams, Jacquemyn, Ochocki, Brys, & Miller, 2015); this preserves the covariance amongst the vital rates. Using a parameter selection approach, a unique kernel is constructed at each time step by randomly selecting the time-varying parameters from their joint probability distribution (Morris & Doak, 2002;Rees & Ellner, 2009;Vindenes et al., 2014). A potential limitation of the parameter selection approach is that an unstructured covariance matrix must be estimated for the set of timevarying parameters, often from relatively few temporal replicates.
An alternative to estimating an unstructured covariance matrix is to use a structured model for the temporal parameters (co)variances.
Hierarchical (multilevel) factor analysis (FA; Figure 1a), whereby one or more latent variables are introduced to capture the temporal covariance among vital rate parameters, is a promising candidate (Marcoulides &

K E Y W O R D S
Carduus nutans, covariation, environmental variation, Eryngium cuneifolium, factor analysis, integral projection model, life history, population dynamics F I G U R E 1 Structure of the stochastic vital rate models for the simple life-history simulation using the (a) factor analytic (FA) approach and (b) unstructured covariance matrix (UCM) approach. In the FA approach factor-loading terms (β Q ) allowed the direction and magnitude of the latent parameter (Q) to differ among the vital rates. Submodel-specific year effects (ε) accounted for additional variation among years. In the UCM approach a fully unstructured covariance matrix (Σ) was estimated by sampling the year effects (ε) from a multivariate normal distribution. β 0 parameters are the intercepts and β z are slopes with respect to size. The subscript t denotes stochastic parameters Moustaki, 2013). The latent variable(s) represent the underlying causes of covariation among observed variables, allowing complex multivariate relationships to be described in a simple way. Moreover, these models effectively capture hypotheses about causal variables that cannot be directly measured (Grace, Anderson, Olff, & Scheiner, 2010;Grace & Bollen, 2008). The FA approach can also be extended to include putative underlying drivers of variation in the latent variables, which allows covariance to be partitioned into explained and unexplained sources of variation. However, despite the broad use of FA approaches in ecological research (e.g., Ohlberger, Scheuerell, & Schindler, 2016;Thorson et al., 2015;Zuur, Fryer, Jolliffe, Dekker, & Beukema, 2003) they are rarely used to parameterise demographic models.
This approach has two potential advantages. First, fewer parameters need to be estimated relative to an unstructured covariance matrix.
Second, a small number of latent variables (often just one) may account for the covariation among the vital rates. When this covariance is positive,  (Ottersen et al., 2001). Such indices do not directly influence the vital rates, but as they provide an index of the overall climate conditions, incorporating multiple local climate variables, they are often better predictors of the vital rates than local climate variables (Post & Stenseth, 1999;Stenseth & Mysterud, 2005).
We conduct simulation studies to compare the accuracy of the FA approach to a standard parameter selection approach, with different numbers of temporally varying parameters. We then apply the approach in two case studies. We construct a demographic model of the monocarpic perennial Carduus nutans, and show how the latent parameter can be perturbed to make predictions about optimal life-history strategies under changing environments. We explore how selection for strategies to delay reproduction differs as the mean and variance of environmental quality changes. Finally, we develop a demographic model of the rare herb Eryngium cuneifolium to show how an environmental variable (time-since-fire) can be incorporated as a putative driver of variation in the latent variable. We use perturbation analyses to determine the optimal fire return interval (FRI) for managing this species.

| S IMUL ATION S TUDY: COMPARING FAC TOR ANALY TIC AND UNS TRUC TURED A PPROACHE S
We compared the accuracy of population growth estimates from the FA approach to those derived using an unstructured covariance matrix. We considered two scenarios: a relatively simple life history with four temporally variable vital rates (the "simple model"), typical of many published IPMs, and a two-stage (juvenile and adult) life history with a total of seven temporally variable vital rates (the "complex model"). Demographic rate functions in both settings were parameterised using data from a long-term study of the St Kilda Soay sheep (Clutton-Brock & Pemberton, 2004). These were used to construct a pair of density independent individual-based models (IBMs; Appendices A1.1.1 and A1.2.1), from which simulated datasets could be generated. Only the correlation coefficients for the temporally varying parameters were allowed to vary in each simulation, such that on each occasion, a correlation matrix was drawn at random from a uniform distribution over the space of positive definite matrices (using rcorrmatrix from the clusterGeneration package in R; Qiu & Joe, 2015).
One hundred simulated datasets of 8,000 years were generated from each of the two IBMs. A range of realistic dataset lengths were sampled: 12, 25, and 50 years (Appendices A1.1.1 and A1.2.1).
Multivariate demographic models were then parameterised using an unstructured covariance matrix (UCM approach) and a latent variable (FA approach) parameterisation ( In the simple model, 10 parameters (4 variance and 6 covariances) account for the temporal variation using the UCM approach, whereas the FA approach estimates eight parameters. In the complex model 28 parameters are required for the UCM approach and 14 for the FA approach. The demographic models were fitted using Bayesian methods, implemented in JAGS (Plummer, 2003) and run using the runjags package (Denwood, 2016) in r (R Core Team, 2016).
IPMs were constructed from each set of posterior samples (Appendices A1.1.2 and A1.2.2). The stochastic population growth rate (λ s ) was estimated after excluding the first 2,000 years of a 5,000 year simulation. This was repeated with 1,000 samples from the posterior. The true stochastic population growth rate (λ t ) was estimated using an IPM parameterised with the true parameter values used in the IBM.
The results of the simulation study are summarised in Figure 2.
The UCM approach led to marginally less diffuse estimates of stochastic population growth rate than the FA approach. This was true for both the simple ( Figure 2a) and complex ( Figure 2b) models. However, even with 12 years of temporal replication the differences between the performance of the two methods was small, and with 25 years of replication both methods performed

| Background and methods
Carduus nutans is a monocarpic thistle with a persistent seedbank and short-lived rosettes (Appendix A2.1; Popay & Medd, 1990;Wardle, Nicholson, & Rahman, 1992). We use a FA model to explore how environmental change may affect selection for reproductive delays in this species. Reproductive delays can act as a form of diversified bet hedging, spreading a cohort across multiple years and therefore decreasing the effect of a bad year on the cohort as a whole (Childs, Metcalf, & Rees, 2010;Cohen, 1966;Rees et al., 2006;Tuljapurkar, 1990). In monocarpic perennial plants, reproduction may be deferred preestablishment, through a seedbank, or postestablishment, through a delay in flowering (Childs, Rees, Rose, F I G U R E 2 Ratio between the estimated (λ s ) and true (λ t ) stochastic population growth rates for the factor analytic (FA) and unstructured covariance matrix (UCM) approaches for the ( Grubb, & Ellner, 2004;Rees et al., 2006). Post-establishment delays have the additional benefit of higher fecundity as individuals may grow larger, producing more seeds (Rees et al., 2006).
We define the fittest strategy to be the evolutionary stable strategy (ESS). The predicted ESS for the study population is substantial seed dormancy and the majority of plants to flower in their first year, with a flowering probability of c. 0.75 for an average sized individual (Rees et al., 2006). simple, as this is just an example case study for the factor analytic approach, we parameterise the IPM using the posterior means.
Thus we do not consider the effect of this uncertainty on the model output. By drawing samples randomly from the posterior instead it would be possible to give a measure of parameter uncertainty and the impacts of this on the results of the perturbation analyses below (e.g., Diez, Giladi, Warren, & Pulliam, 2014;Evans, Holsinger, & Menges, 2010). Posterior checks suggested the latent parameter (Q) accounted for the covariation among the vital rates (Appendix A2.4).
The positive covariance among the vital rates means the latent parameter can be assumed to be a measure of environmental quality.
The highest levels of temporal variation were in survival and recruitment (Appendix A2.4).
At each year in the simulation the latent parameter (Q) was sampled from a normal distribution with a mean of zero and a standard deviation of one. The submodel-specific year effects (ε) were drawn from normal distributions with means of zero and the standard deviations (σ t ) estimated in the vital rates model. The joint flowering intercept and germination probability ESS were predicted using numerical invasion analysis (Childs et al., 2004) and were similar to those produced using a fixed effects, kernel selection approach (Appendix A2.5; Rees et al., 2006). Environmental variability ESS germination probability, h standard deviations of Q were varied on a fixed grid and the ESS were predicted at each value. This was repeated for a range of seed mortalities (d = 0.01, 0.1, 0.2…0.9, 0.99; Rees et al., 2006).

| Perturbation analyses
As the quality of the environment deteriorates there is selection for earlier flowering and reduced germination, while improving the quality of the environment leads to the opposite response, that is, selection for a perennial life history dominates in higher quality environments (Figure 3 and Appendix A2.6). In lower quality environments selection acts on the germination probability,

| Background and methods
Eryngium is a fire-adapted perennial herb with a persistent seedbank (Menges & Kimmich, 1996;Menges & Quintana-Ascencio, 2004) found in Florida rosemary scrub, in recently burned or other disturbed areas (Menges & Kimmich, 1996). Fire kills the majority of rosettes and the population recovers through the seedbank (Menges & Kohfeldt, 1995). We used demographic data from a single population that forms part of a well-studied meta-population at the Archbold Altering the frequency of fires is one possible management strategy for this endangered species. The recommended FRI for this species of <15 years (Menges & Quintana-Ascencio, 2004) differs from the 15-30 year recommendations for its Florida scrubland habitat (Menges, 2007). Alternative management strategies may therefore be required for Eryngium. We use perturbation analyses to determine how altering FRIs and the effect of fire on the vital rates affects population growth.
The Eryngium IPM (Appendix A4.2) was structured by the natural logarithm of rosette diameter (Menges & Quintana-Ascencio, 2004). We assume density independent dynamics to investigate the persistence of the population (Menges & Quintana-Ascencio, 2004; see Appendix A5 for model with density dependent recruitment).
The intercepts of four vital rates were assumed to be temporally variable (survival, growth, flowering probability, and the number of flowering stems; Appendix A4.3). As the demography of Eryngium is strongly affected by fire, we modelled the mean of the latent parameter (Q) as a linear function of time-since-fire (TSF; Appendix A4.3).
Flowering and the number of flowering stems were highly correlated, so the flowering (ε f ) and (ε b ) year effects were sampled from a bivariate normal distribution. Sampling these parameters from univariate distributions results in the latent variable failing to fully account for the covariation among the vital rates (Appendix A4.3).
Posterior samples were again drawn using MCMC sampling in JAGS, using runjags (Denwood, 2016). Weakly informative priors were used (Appendix A4.3; see Appendix A3 for a comparison with more informative priors). The vital rates were negatively related with TSF, with survival particularly strongly affected (Appendix A4.4).
The posterior means were used to parameterise an IPM (Appendix A4.4). At each iteration, the latent parameter (Q) was randomly sampled from a normal distribution with mean β tsf × TSF and standard deviation of one. Submodel-specific year effects were drawn from normal distributions (bivariate normal for flowering and flowering stems), with means of zero and the estimated (co) variances. Estimates of germination probability range from 0 to 0.1 and 0.005 to 0.04 for first (h f ) and second year germination (h b ) respectively (Menges & Quintana-Ascencio, 2004;Quintana-Ascencio & Menges, 2000). To select a fertility scenario for the perturbation analyses predicted dynamics using a range of these estimates and of seed mortality probabilities (0.5, 0.7, 0.9) were compared to those observed in the field (Appendix A4.5). A model with low first year germination (0.0), high germination from the seedbank (0.04) and low seed mortality (0.5) was selected as it was consistent with observed changes in aboveground population growth (Figure 4a). That is, aboveground populations were predicted to increase immediately following a fire, but not beyond 10 years postfire (Menges & Quintana-Ascencio, 2004).

| Perturbation analyses
The effects of different fire regimes were explored using a range of constant FRIs from two to 30 years (Appendix A4.6). Stochastic population growth rates were estimated by iterating 100 populations for 1,000 years; the first 200 years were excluded as transient dynamics. We found populations were likely to decline where the time between fires was too short (c. <4 years), because plants do not produce enough seeds to replenish the seedbank, or too long (c. >15 years; Figure 4b), as they are outcompeted. This is in accordance with a previous study, using a matrix selection approach, which found an optimal FRI of less than 15 years (Menges & Quintana-Ascencio, 2004).
To determine how altering the effect of fire on the vital rates affected population growth the β tsf parameter was perturbed (Appendix A4.6). This is a measure of how quickly the environment decays as TSF increases; more negative values of this parameter indicate the quality of the environment decreases more quickly following a fire. Stochastic population growth rates were estimated as before, but the fire regimes were varied randomly throughout the simulations (with the same chance of each FRI occurring), either between 1 and 15 years (optimum for Eryngium) or between 15 and 30 years (optimum for Florida scrub habitat). Decreasing β tsf by around 1/3 could make a 15:30 year FRI strategy sustainable for Eryngium (Figure 4c). The effect of altering the temporal decay of the environment is much higher when the FRI is higher, to the extent that decreasing β tsf sufficiently can make longer FRIs preferable for Eryngium (Figure 4c).

| D ISCUSS I ON
Identifying the environmental drivers of variation in demographic performance is challenging. A variety of approaches have been proposed (e.g., Teller, Adler, Edwards, Hooker, & Ellner, 2016;Van der Pol et al., 2016), but the performance of any method is limited by the degree of temporal replication available. The mean length for a demographic dataset is 6 years in plants and eleven in animals (Salguero-Gomez et al., 2015). Yet, simulations suggest 20-25 years of data are needed to identify putative environmental drivers, determine the temporal window over which they act and reliably estimate the magnitude of their effects (Teller et al., 2016;Van der Pol et al., 2016). Efforts to identify drivers in many of these populations will not succeed, forcing population ecologists to assess the likely effects of environmental change using indirect methods. The observation that, in natural populations, different components of demographic performance covary, often positively, (Jongejans et al., 2010;Nur & Sydeman, 1999;Rotella et al., 2012) implies different demographic processes respond (at least in part) to the same drivers.
We have demonstrated how a factor analytic (FA) framework can be used to incorporate a temporal axis of environmental variation into a demographic model. The resulting multi-process model-coupled via a latent "environmental quality" variable-requires fewer parameters than its unstructured (UCM) counterpart.
The advantage of adopting the FA approach depends on the application domain. Accounting for vital rate covariation can be important for estimating stochastic population growth rate. For example, Metcalf et al. (2015) demonstrated that parameter and element selection approaches were roughly comparable in terms of accuracy as long as covariance amongst the vital rates were taken into account. In principle, an FA model might yield more precise estimates of population growth because it requires fewer parameters than its unstructured (UCM) counterpart, though this comes at a potential cost of increased bias when the model is insufficiently flexible. In practice, we found that the UCM and FA approaches yielded comparable estimates of population growth rate in our simulation study, indicating that sampling error was the dominant source of uncertainty in predicted population growth rate. Thus, the FA approach may offer little advantage over standard UCM models when estimating stochastic growth rate. The goal of a retrospective analysis is to attribute realised variation in a statistic such as population growth rate to different drivers of variation. Temporal covariances can be important in this context. For example, between one half and onethird of the variation in population growth was accounted for by covariation among vital rates in three ungulate populations (Coulson, Gaillard, & Festa-Bianchet, 2005), though this result is far from universal in nature (Compagnoni et al., 2016). The FA model could be used as a basis for such analyses, though in practice all this would achieve is to attribute covariance contributions to latent variable(s).
The main advantage of using a FA model is that it identifies the main axis of demographic variation, which then provides a basis for prospective analysis of how populations may respond to F I G U R E 4 (a) Aboveground population growth rate for Eryngium r = log environmental change. When it is not possible to explicitly identify environmental drivers of demographic variation, local perturbation analysis of model parameters can be used to explore the potential response of a population to environmental change (Rees & Ellner, 2009 poral variation this approach could also be used to predict joint demographic responses to spatial variation or extended to incorporate spatial-temporal variation (Elderd & Miller, 2016).
The key limitation is that this interpretation of the FA model assumes the temporal covariances are largely environmentally driven.
This may not be true if individuals substantially adjust their allocation strategy in response to environmental conditions. Negative correlations among the vital rates may exist due to life-history trade-offs between vital rates, where, for example, resources are invested in survival or reproduction to the detriment of growth (Koenig & Knops, 1998).
Negative correlations appear relatively rare, however (Jongejans et al., 2010), and where they do exist are sometimes attributable to opposing responses to environmental drivers (e.g., Knops et al., 2007). This suggests the magnitude of trade-off effects is generally small compared to that of environmental effects, though life-history trade-offs may still attenuate environmental driver(s) of covariation. Note that whilst a FA approach can still be used where the covariances among vital rates are negative the interpretation of the latent variable is more difficult here.
That is, the latent variable can only be assumed to be a measure of environmental quality where the vital rates positively covary.
Explicitly incorporating putative environmental drivers allows population responses to management strategies or anticipated environmental change to be predicted (e.g., Gotelli & Ellison, 2006;Isaza et al., 2016). The FA approach can simplify the process of incorporating such drivers, as they can be included into a single model of the shared environmental axis. Where explicit environmental drivers (e.g., population density or temperature) can be identified, these are typically considered on a process-specific basis, by constructing separate models for survival, reproduction, growth, and recruitment (e.g., Dahlgren, Ostergard, & Ehrlen, 2014;Williams et al., 2015). This would require the addition of four time-since-fire slope parameters in our Eryngium case study, one for each temporally variable vital rate (e.g., Evans et al., 2010). Instead we introduced a higher level model, decomposing the shared axis of environmental variation into explained and residual components of variation. Thus the effects of time-since-fire on all four vital rates were incorporated with the addition of a single parameter. This allowed us to evaluate the population level effects of two alternative management strategies; that is, altering the disturbance regime or ameliorating the environment to decrease the rate of decay in environmental quality following a disturbance.
We found that whilst the optimum FRI for Eryngium is less than 15 years, decreasing the rate of environmental decay could lead to persistent populations under 15-30 year fire regimes (the recommended FRI for the Florida scrub habitat; Menges, 2007).
However, in previous models the slope terms for the environmental quality parameter (β Q ) were constrained to a value of +1 or −1 among a set of demographic models. These usually operate on different scales. For example, probabilities such as survival and flowering are typically estimated on a logit scale, whereas fecundity is generally estimated on a log scale; a unit change on these two scales cannot be meaningfully compared. Moreover, differences in the magnitude of the effect of temporal variation among the vital rates were accounted for by the process-specific year effects (ε). Thus the main advantage of the FA approach is lost, as the latent variable cannot be conceived as a measure of overall environmental quality.
Adopting a Bayesian approach has a number of benefits (Elderd & Miller, 2016), for example, allowing the effects of difference sources of uncertainty to be quantified (Evans et al., 2010). Uncertainty is likely to be very high for most datasets (Metcalf et al., 2015).
Parameter uncertainty can have important ecological implications, for example, failing to account for it may underestimate the risk of extinction (Ludwig, 1996). Carrying out a range of checks following model fitting (Appendix A1.3) can help to increase the accuracy of the model output. Such checks are particularly important when fitting very constrained models, for example, when assuming the temporal covariation in the vital rates may be explained by a single environmental axis. Sometimes, as in the Eryngium case study, additional axes may be necessary to fully account for the covariation in the vital rates. We recommend starting with a simple model structure and slowly adding in complexity (Appendix A1.3).

Rapid levels of environmental change have increased interest in
determining how population processes respond to environmental stochasticity (Evans, 2012;Stenseth et al., 2002). However, the long-term individual level data needed to accurately quantify such responses are often lacking, especially for rare species. Where positive covariance exists among vital rates these can be exploited under a FA approach to allow predictions on the joint responses of vital rates to environmental variation. Where insufficient data exist to identify putative environmental drivers the FA approach may offer the best alternative for predicting population responses to environmental change.

ACK N OWLED G EM ENTS
B.J.H. was funded by a NERC (NE/L501682/1) and University of Sheffield PhD studentship. D.Z.C. was funded by a NERC fellowship