layeranalyzer: Inferring correlative and causal connections from time series data in r

Distinguishing correlative and causal connections among time series is an important challenge in evolutionary biology, ecology, macroevolution and palaeobiology. Here, we present layeranalyzer, an r package that uses linear stochastic differential equations as a tool for parametrically describing evolutionary and ecological processes and for modelling temporal correlation and Granger causality between two or more time series. We describe the basic functions in layeranalyzer and briefly discuss modelling strategies by demonstrating our tool with three disparate case studies. First, we model a single time series of phenotypic evolution in a bird species; second, we extract cyclical connections in the well‐known hare‐lynx dataset; third, we infer the correlative and causal connections among the genus origination and extinction rates of brachiopods and bivalves. We summarize the advantages and limitations of using linear stochastic differential equations and layeranalyzer for studying correlative and causal connections.

Linear SDEs offer process models of how a given set of time series were generated, such that correlations and Granger causality (Granger, 1969) can be distinguished, especially for continuous time processes (Schweder, 1970). Granger causality is based on information flow. Roughly speaking, process B (Granger) causes process A if the predictive probability function for future states of process A is different when conditioning on all past states of all processes, and on conditioning on all past states of all processes, except process B (e.g. Eichler, 2013 and Supporting Information for details on causal connections).
Some advantages of linear SDEs as a time series tool are as follows. (a) The stochasticity of the processes can be separated from measurement noise, as the processes have temporal correlations while the latter does not. (b) Measurements do not need to be taken at equidistant time steps, unlike in ARIMA-based methods (Box, Jenkins, & Reinsel, 1994, see Chapter 2) and analysed time series can in fact be 'gappy'. This is especially useful for ecological and palaeontological data which are rarely equally spaced in time. Linear SDEs are analytically tractable (Reitan et al., 2012).
Here, we describe a new r package layeranalyzer that parametrically characterizes time series and models temporal correlation and Granger causality using linear SDEs. We provide three different examples to illustrate the types of ecological and evolutionary questions layeranalyzer can be used to answer.

| OVERVIE W OF L AYER A NA LYZER
layeranalyzer characterizes measurements of continuous time processes in the form of linear SDEs (for which each realized state is normally distributed, conditioned on the previous state), where measurement noise is also assumed to be normal. The analytically calculated expected values and covariance matrices (see Supplementary Information in Reitan et al., 2012) are used in combination with the Kalman filter, a recursive algorithm that performs process state inference (Särkkä, Vehtrari, & Lampinen, 2004) and calculates likelihood values simultaneously. Our Supporting Information contains a short verbal description of this algorithm. The Kalman smoother is used for process state inference (see Särkkä et al., 2004 and Supporting Information for details), giving inference for the process state at each given point in time, conditioned on all parameter values and all measurements. While both maximum likelihood (ML) estimation and Bayesian inference are possible in layeranalyzer, we focus on the latter using Markov chain Monte Carlo (MCMC) sampling. This is because Bayesian inference was more reliable than ML for varied scenarios we have explored and we found that ML estimation in our examples (and likely in general for layeranalyzer) requires good starting points provided by the MCMC samples, regardless.
layeranalyzer also allows for detecting whether an observed process (such as changing phenotypes, demographic dynamics or diversification rates) is affected correlatively or causally (Granger, 1969) by other measured and unmeasured processes. An unmeasured process with a causal connection to another process (itself measured or unmeasured) is termed a hidden layer, hence the name of the package.
The measured process is designated the top layer (i.e. layer 1), while the unmeasured, hidden process is the 'lower' layer (i.e. layer 2) in a 2-layered system. If another process (i.e. layer 3, the lowest layer in this case) drives layer 2, the system would be 3-layered, and so on. Unmeasured processes leave signatures on the autocorrelation structure of observed processes, allowing us to characterize such unmeasured time series. In addition to giving greater flexibility in describing the correlation structure of a time series, including layers in one's models can be motivated by the empirical system being studied. For instance, our earliest application aimed at distinguishing between the measured phenotype of fossil coccoliths (layer 1) and their optimum phenotype (layer The key function in the layeranalyzer package (Table 1) is layer.
analyzer, which analyses a single model (which may include multiple time series). Options for the layer.analyzer function include: a. Process structure: The default model fitted for a single time series is a one-layered Ornstein-Uhlenbeck process (Table 1).
However, non-stationary Wiener processes, linear trends or periodicity are alternative options available. Hidden layers can also be imposed.
b. Causal connections: One process x 1 (t) (measured or hidden) can affect another, x 2 (t), where β 1,2 denotes the strength of the causal connection. Specifically, if x 1 (t) changes by one unit, x 2 (t) will gradually change by β 1,2 units. This can be modelled with the option 'causal'. Causal connections can be uni-or bidirectional.

c. Correlative connections: A correlative connection is inferred if
x 1 (t) and x 2 (t) are affected by the same stochastic perturbatio and this is modelled using the option 'corr'. d. Group options. Comparable measurements from grouped locations can be analysed as a clustered set of processes with the same structure (see Reitan et al., 2012).

| C A S E S TUDY 1: ONE TIME S ERIE S OF PHENOT YPIC CHANG E
The data analysed in this example are the body size measurements of individuals of Acrocephalus scirpaceus over a 19-year period (Saetre et al., 2017). The authors suggested that these reed warblers evolved rapidly to a smaller optimal body size as described by an Ornstein-Uhlenbeck process.
We performed an automatic model search for process structure using the function traverse.standalone.layered after conforming the size measurements and their uncertainties into a layer.
data.series object. We summarized the ML based (option do. maximum.likelihood=True in traverse.standalone.layered) model search output using the function compare.layered, so we could compare our model selection results with those of Saetre et al. (2017). traverse.standalone.layered has an option for the maximum number of layers explored. In this example, we set this to two (i.e. maximum one hidden layer), with the lowest layer being either an OU process, an OU process with linear time trend or a Wiener process.
According to the layeranalyzer AICc model choice, there were no hidden layers in the body size data. However, an OU process with a linear time trend was slightly better than a pure OU process (see Supporting Information Table S1). This suggests that the size of A. scirpaceus started above the optimal body size, but the optimal body size was increasing (Figure 1

| C A S E S TUDY 2: T WO TIME S ERIE S OF P OPUL ATI ON G ROW TH
The Canadian hare-lynx system, first described by MacLulich (1937) and later Elton and Nicholson (1942), is a classic predator-prey cycle with feedback loops that has been repeatedly re- layeranalyzer can be used to infer feedback loops. Note that we here use count data from traps as a proxy for yearly abundance (as done by previous authors), which is in turn used as a proxy for the continuous expected abundance process in our models (see May, 1973).
Before analyses, we normalized the count data (see Supporting Information). We then applied the function traverse.standalone.layered on hares and lynx separately. These independent searches revealed that the normalized hare and lynx counts each conform to a two-layered model with negative feedback from first to second layer, suggesting a negative-positive feedback loop with an unmeasured process. In other words, a positive-negative feedback loop leading to cyclical behaviour can be gleaned from just one time series with layeranalyzer. We know from previous work that lynx abundance might be driving hare abundance. To explicitly explore this, we imposed an OU-like process as the structure for each of these two measured time series. Then we searched for connections between the hare and lynx data using the function traverse.
connections.layered and compared the resulting models using compare.layered. By rerunning the layer.analyzer function on the model deemed best using the smoothing.specs option, we provide estimates for the processes (Figure 2). We specified in smoothing.
specs, that in addition to performing inference for the measured points, we wanted inference for regularly spaced time points (ten per year), in order to get an almost continuous process state inference over time. We do not have estimates for the standard deviation of each measurement error for these datasets. For such cases, layer.analyzer introduces a parameter for each time series that represents a common standard deviation for all the measurement errors.

| C A S E S TUDY 3: THREE TIME S ERIE S OF D IVER S IFI C ATI ON R ATE S
In this example, we reproduce the analyses in Reitan and Liow (2017), with simplifications, to illustrate how three or more time series can be investigated simultaneously using layeranalyzer. (2017) investigated how the diversification dynamics of whole clades might influence those of different clades with similar niches using genus origination and extinction rates of bivalves and brachiopods estimated in Liow et al. (2015). The best model describing these processes is one where high bivalve extinction rates drove brachiopod origination rates through more than 450 million years of their evolutionary history . There was also a causal connection from bivalve to brachiopod extinction rates.

Reitan and Liow
We constrain the enormous number of joint process models examined in Reitan and Liow (2017) to 25 by (a) reducing the original six observed time series  to only three (b) only analysing models where one connection has been added, changed or removed relative to the model considered best by Reitan and Liow (2017) and (c) assuming the same prior knowledge of the system for the purpose of demonstration. Specifically, we assume that the structure of these three time series, namely extinction rates for brachiopods and bivalves (one hidden layer each) and the origination rates for brachiopods (only one measured layer) are known, to F I G U R E 1 Changes in Acrocephalus scirpaceus body size. Yearly mean natural log body mass (g) of individuals of A. scirpaceus, with their 95% confidence intervals. The solid line shows the inferred, linearly changing optimal body size from the best model F I G U R E 2 Lynx and hare process state inference. The plotted points are empirical data while black lines are estimated means and grey lines are 95% credibility bands. Lynx data start in 1897 and end in 1939 (with 1914 missing), while hare data start in 1864 and end in 1935. Note that the uncertainty bands make small 'bubbles' between each years as the models are in continuous time and the uncertainty increases between measurements reduce analysis time. Note that in addition to affecting their corresponding observed processes, hidden layers also function as common causes in a connection analysis. Since we are examining a fixed set of models rather than traversing all possible models, we use layer.analyzer to examine each of the 25 models and then use compare.layered to compare and evaluate them afterwards (see Supporting Information for details).
Using layeranalyzer, we find that the process model presented in Reitan and Liow (2017), was indeed the best one among the 25 models examined, according to the Bayesian model likelihood and resulting model probability (see Supporting Information for parameter estimates and comparisons with estimates from Reitan and Liow (2017)).

| C AUTI ONARY NOTE S AND FUTURE DE VELOPMENT FOR LINE AR S DE S AND L AYER A NA LYZER
Linear SDEs perform well in recovering correlative and causal processes under the set of scenarios we have explored (see Supporting Information in Liow et al., 2015). While a comprehensive exploration of the performance of this tool is out of the scope of the current work, we have provided simulation tools for users to investigate the behaviour of layeranalyzer (see the project web page https :// github.com/trond reita n/layer analyzer).
Since linear SDEs are Gaussian processes, the input data are assumed to be normally distributed, and two of our examples required transformation to meet this assumption. Additionally, in order to calculate an analytical likelihood, measurement noise should be also normally distributed.
Nonlinear connections between processes, such as in the Lotka-Volterra model for predator-prey relationships (Lotka, 1910) breaks the underlying assumption of linear SDEs. That said, such a linear model can still capture the salient features of nonlinear connection models, as demonstrated in the hare-lynx example. However, highly nonlinear connections cannot be correctly modelled using this tool, even with data-transformation. By using linear SDEs, there is also an implicit assumption that each process depends on its previous state (as well as the state of other connected processes) in a linear fashion. It is worth exploring how each residual might depend on the previous residual where this is a concern, by using, for instance, nonlinear regression tools.
Where nonlinearity is a serious concern, nonparametric tools such as cross convergent mapping (Sugihara et al., 2012;Ye, Deyle, Gilarranz, &amp;Sugihara, 2015)  In summary our tool layeranalyzer provides a means to characterize and quantify ecological and evolutionary processes from time series data.

ACK N OWLED G EM ENTS
We thank Camilla L. C. Saetre and Kjetil L. Voje for supplying, discussing the A. scirpaceus dataset. We thank Gene Hunt and an anonymous reviewer for constructive criticisms of our manuscript. We also thank Tore Schweder for setting us on the path

DATA AVA I L A B I L I T Y S TAT E M E N T
The project page is https ://doi.org/10.5281/zenodo.3406161  and package can be installed from Github using install_github(repo="trondreitan/layeranalyzer"). The data and example code are supplied in the package. The package will be maintained on github and on the first author's website.