Volume 7, Issue 3
Research Article

A functional model for characterizing long‐distance movement behaviour

Frances E. Buderman

Corresponding Author

Department of Fish, Wildlife, and Conservation Biology, Colorado State University, Fort Collins, CO, 80523‐1484 USA

Corresponding author. E‐mail: franny.buderman@colostate.edu
Mevin B. Hooten

Department of Fish, Wildlife, and Conservation Biology, Colorado State University, Fort Collins, CO, 80523‐1484 USA

U.S. Geological Survey, Colorado Cooperative Fish and Wildlife Research Unit, Colorado State University, Fort Collins, CO, 80523‐1484 USA

Department of Statistics, Colorado State University, Fort Collins, CO, 80523‐1484 USA

Graduate Degree Program in Ecology, Colorado State University, Fort Collins, CO, 80523‐1484 USA

Jacob S. Ivan

Colorado Parks and Wildlife, Fort Collins, CO, 80526 USA

Tanya M. Shenk

National Park Service, Fort Collins, CO, 80525 USA

First published: 31 August 2015
Correction note: Equation 14 appeared incorrectly in the HTML version of this article (the PDF version has been correct since first publication); this was corrected on 07 October, after first online publication.

Summary

  1. Advancements in wildlife telemetry techniques have made it possible to collect large data sets of highly accurate animal locations at a fine temporal resolution. These data sets have prompted the development of a number of statistical methodologies for modelling animal movement.
  2. Telemetry data sets are often collected for purposes other than fine‐scale movement analysis. These data sets may differ substantially from those that are collected with technologies suitable for fine‐scale movement modelling and may consist of locations that are irregular in time, are temporally coarse or have large measurement error. These data sets are time‐consuming and costly to collect but may still provide valuable information about movement behaviour.
  3. We developed a Bayesian movement model that accounts for error from multiple data sources as well as movement behaviour at different temporal scales. The Bayesian framework allows us to calculate derived quantities that describe temporally varying movement behaviour, such as residence time, speed and persistence in direction. The model is flexible, easy to implement and computationally efficient.
  4. We apply this model to data from Colorado Canada lynx (Lynx canadensis) and use derived quantities to identify changes in movement behaviour.

Introduction

Data sets consisting of animal locations are often collected for purposes other than movement analysis (e.g., survival analysis, demographic studies; White & Shenk 2001; Winterstein, Pollock & Bunck 2001) or with technology that prohibits long‐term fine‐scale movement modelling (Yasuda & Arai 2005). For example, radiotelemetry may be used to estimate survival (Cowen & Schwarz 2005), but the locations may not be used in the analysis (e.g., Buderman et al. 2014; Hightower, Jackson & Pollock 2001). These data sets are costly and time‐consuming to collect, but often contain a wealth of unused spatial information. The ability to spatially characterize movement behaviours using data sets that are insufficient for fine‐scale movement modelling may help management and conservation agencies identify critical areas for wildlife movement (Berger 2004). In addition, with appropriate temporal data, researchers can also better understand mechanisms that regulate movement behaviour (Hays et al. 2014; Scott, Marsh & Hays 2014).

Runge et al. (2014) divide long‐distance movements into four categories: irruption (dispersal), migration, nomadism and intergenerational relays (which we do not address). Such movement behaviour can vary among individuals and over an individual's lifetime, though some species may be more inclined to exhibit one kind of long‐distance movement behaviour (LDMB) over another (Jonzén et al. 2011; Mueller et al. 2011; Singh et al. 2012). For most organisms, the causes and costs of dispersal will vary by individual and in space and time (Bowler & Benton 2005), resulting in a continuum of movement behaviours (Jonzén et al. 2011). LDMB may contribute substantially to population dynamics because it is the main determinant of population spread and colonization rates (Greenwood & Harvey 1982; Shigesada & Kawasaki 2002). Thus, LDMB is an important life‐history trait for many processes such as species invasions, range shifts and local extinctions, reintroduction programmes, metapopulation dynamics, connectivity and gene flow (Trakhtenbrot et al. 2005).

The spatial location of these behaviours could inform conservation efforts for species capable of long‐distance movements, as some behaviours may be more important than others for population persistence (Runge et al. 2014). In addition, comparing contemporary movement data with properly analysed historical data may identify changes in movement behaviour resulting from natural and anthropogenic disturbances. Changes in migratory behaviour could have wide‐ranging consequences in cases where the species contributes significantly to the biological assemblage (Robinson et al. 2009). Species are usually limited in their range by dispersal ability, foraging ecology or available habitat (Hays & Scott 2013; Wood & Pullin 2002), and as habitat fragmentation and climate variability increase, the ability of species to traverse long distances will become critical (Bowler & Benton 2005). Species that have the capability for long‐distance movement may be able to track habitat as environmental conditions change. However, individuals usually depend on a network of suitable habitats for different behaviours (e.g., breeding or migration; Robinson et al. 2009). Long‐term survival of the species can be reduced when the distance between patches exceeds dispersal ability (Trakhtenbrot et al. 2005), or when suitable habitat is not available for all of the behaviours that occur during an annual cycle (Robinson et al. 2009).

Although dispersal, migration and nomadism are all LDMBs, they may differ in characteristics that can be quantitatively measured, such as residence time, speed and persistence in direction (described using the turning angle). For example, areas where individuals are foraging or maintaining a home range may be identified by longer residence times or slower speeds (Schofield et al. 2013) and undirected motion (Morales et al. 2004). In contrast, movement may be faster (Dickson, Jenness & Beier 2005) and more directed (Haddad 1999) within corridors. Nomadic individuals may exhibit speeds similar to those of migrators and dispersers, but they would appear to be perpetually dispersing, with no consistent activity centre and a turning angle independent of previous movements (Lidicker & Stenseth 1992). Dispersal and migration may have similar speed and directional characteristics, but migration is a seasonally repeated movement between the same areas (Berger 2004) by individuals within a population (Sawyer, Lindzey & McWhirter 2005), whereas dispersal is a one‐way movement (Lidicker & Stenseth 1992).

Movement behaviour is typically monitored using very high frequency (VHF) or satellite telemetry devices. These monitoring devices are more effective at detecting LDMB than plot‐based studies, which may underestimate long‐distance movement (Koenig et al. 1996). The frequency of VHF data is determined by how often an individual can be located, and the data are spatially restricted to the actively searched area. Aerial location accuracy associated with VHF data may be affected by antenna type, altitude and observer skill, while ground triangulation accuracy may be additionally impacted by terrain, vegetation, power lines, and weather (Mech 1983). In contrast, the intended fix rate for a satellite telemetry device is preprogrammed and often regularly spaced in time. Fix success rates and accuracy can be influenced by animal behaviour (such as diving), canopy cover, terrain and climatic conditions (e.g., Di Orio, Callas & Schaefer 2003; Dujon, Lindstrom & Hays 2014; Heard, Ciarniello & Seip 2008; Mattisson et al. 2010). The device's satellite system (GPS or Argos Satellite maintained by Service Argos) can also influence accuracy of the location observations (Costa et al. 2010; Dujon, Lindstrom & Hays 2014; Heard, Ciarniello & Seip 2008; Patterson et al. 2010; Vincent et al. 2002). In addition, fix success rate, battery life and accuracy may all depend on transmitter manufacturer and model. Both VHF and satellite components can be placed into the same device or individuals can be outfitted with two separate devices, resulting in data sets consisting of multiple data types.

Movement modelling often seeks to spatially characterize an individual's location as a function of time; however, this function may be highly complex and non‐stationary. In addition, measurement error varies among monitoring methods and can be large enough to overwhelm small‐scale movement patterns (Breed et al. 2011; Kuhn et al. 2009). Coupled with temporal irregularity and missing data, these attributes may prohibit the use of contemporary movement models. We have found that many available methods do not readily accommodate multiple sources of data and must impute missing data to obtain locations at regular intervals (e.g., Hanks et al. 2011; Hanks, Hooten & Alldredge 2015; Hooten et al. 2010; Johnson, London & Kuhn 2011). For example, the continuous‐time correlated random walk model presented by Johnson et al. (2008) only accounts for elliptical error distributions. Breed et al. (2012) incorporated an augmented particle smoother into a CRW process model to allow for time‐varying parameters; however, their method does not account for multiple data sources and its effectiveness was only demonstrated on highly accurate GPS data at a fine temporal scale (10–30 locations day⁻¹). Winship et al. (2012) incorporated multiple data sources (Argos, GPS and geolocation data) into a state‐space model, but the method performed poorly when there were data gaps, relied heavily on the estimates of Argos precision presented in Jonsen, Flemming & Myers (2005) and treated the GPS data as equivalent to the best Argos location class. Change‐point models require specifying or estimating the number of change points, and the change points are discrete in time (Gurarie, Andrews & Laidre 2009; Hanks et al. 2011; Jonsen, Flemming & Myers 2005; Jonsen, Myers & James 2007); modelling smooth transitions in the change‐point framework is more difficult.
Given that one individual may exhibit many different LDMBs, we seek a model that is flexible enough to detect different types and degrees of movement behaviour, without specifying or estimating the number of change points. Brownian bridge movement models, a method commonly used with high‐resolution telemetry data, have been shown to work well only when the measurement error is negligible (Pozdnyakov et al. 2014), making them unsuitable for data sets obtained with VHF or Argos technology, which can be subject to substantial error. Recent applications of wavelet analyses also do not account for location error or uncertainty in the change‐point identification and are not feasible with sparse and irregular data sets (Lavielle 1999; Sur et al. 2014).

Basis functions are a useful set of tools for approximating continuous functions, such as movement paths, when ordinary polynomials are inadequate to describe the behaviour of the function (Rice 1969). Commonly used basis functions include wavelets, Fourier series and splines. Approximating a function with splines is computationally easy because the function is just a weighted sum of simpler functions (Wold 1974), and such tools have been incorporated into standard statistical software. Wold (1974) recognized that splines may be most useful in low information settings where the ultimate goal is to compare individual estimates of a few characteristic parameters that describe the curve. Basis functions have been used extensively in fields such as physics (e.g., Sapirstein & Johnson 1996), medicine (e.g., Gray 1992), medical imaging (e.g., Carr, Fright & Beatson 1997) and climate science (e.g., Sáenz‐Romero et al. 2010). However, basis functions and associated statistical methods are less commonly used in ecology. Most applications focus on modelling species distributions (e.g., Lawler et al. 2006; Leathwick et al. 2005) and population dynamics (e.g., Bjørnstad et al. 1999), though splines have broad applicability in generalized additive models (Hastie & Tibshirani 1990; Wood & Augustin 2002). For example, Hanks, Hooten & Alldredge (2015) used B‐splines to model spatial transition rates as a function of location and direction‐based covariates and time‐varying coefficients. In addition, recent efforts have used B‐splines to estimate density functions associated with movement‐related behavioural states (Langrock et al. 2014). Tremblay et al. (2006) used Bézier, Hermite and cubic splines as strict interpolators of irregular telemetry data from ocean‐obligate species; however, they assumed the filtered Argos locations were the true locations.
There is also precedent in the statistical literature for the equivalence between stochastic movement processes, such as the Wiener process, and smoothing polynomial splines (Wahba 1978; Wecker & Ansley 1983).

We describe a functional approach to movement modelling using basis functions within a Bayesian model that accounts for multiple data types and their associated error, recognizing that the observed locations are not the true locations. The basis functions allow us to account for temporal variation in the continuous underlying movement path without specifying movement mechanisms. We then use derived quantities, such as residence time, speed and persistence in direction, to characterize movement behaviour. In addition, the model is multiscale, allowing for movement behaviour at multiple biologically relevant temporal scales. We use this model to describe how reintroduced Canada lynx (Lynx canadensis) moved throughout Colorado. The two data collection methods, along with their measurement error and the sampling irregularity, make this an ideal data set to demonstrate the utility of our model.

Methods

Conventional functional data analysis (FDA) assumes that there is a continuous underlying process, but the observations are temporally discrete, may be subject to error and are temporally irregular (Ramsay & Dalzell 1991; Ramsay & Silverman 2002; Ramsay & Silverman 2005). Unlike traditional time series analysis, FDA does not assume stationarity or regularity of time intervals (Levitin et al. 2007). The continuous function of interest is approximated using basis functions, which are a set of patterns that capture the main shape of the curve (Ferraty & Vieu 2006; Hastie, Tibshirani & Friedman 2009; Ramsay & Silverman 2005). In our case, different sets of basis functions account for complexity in the process at different temporal scales, allowing us to detect both large‐ and small‐scale movement. In addition, FDA is useful when the objectives of an analysis are to estimate the derivatives of a function (Levitin et al. 2007; Ramsay & Dalzell 1991). In our framework, functions of temporal derivatives, such as residence time, speed and persistence in direction, are derived quantities that can characterize the movement path. The Bayesian framework allows for inference concerning these derived quantities and their associated uncertainty while incorporating multiple data sources; for our purposes, we incorporated VHF and Argos data into a single model.

Data Model

We consider each observed (centred and scaled) location, $s_j(t)$ for a time $t \in \mathcal{T}$ associated with data type j (j = 1,...,6 are Argos error classes and j = 7 denotes VHF), to arise from a multivariate normal mixture model with mean, z(t), representing the true location at time t and a covariance matrix $\Sigma_j$ such that
$$s_j(t) \sim \begin{cases} \mathrm{N}(z(t), \Sigma_j) & \text{if } w(t) = 1 \\ \mathrm{N}(z(t), \tilde{\Sigma}_j) & \text{if } w(t) = 0 \end{cases}$$ (eqn 1)
The covariance matrix, $\Sigma_j = \sigma_j^2 R$, represents the error variance associated with each data type where the correlation matrix is
$$R = \begin{pmatrix} 1 & \rho\sqrt{c} \\ \rho\sqrt{c} & c \end{pmatrix}$$ (eqn 2)
for j = 1,...,6 and $R = I$ for j = 7. The prior distribution for the measurement error variance, $\sigma_j^2$, was modelled as an inverse gamma, IG(q,r), where q is the shape parameter and r is the rate parameter. Argos error for all error classes has been shown to be larger than reported by Argos and greater in the longitudinal direction (Boyd & Brightsmith 2013; Costa et al. 2010; Hoenner et al. 2012); therefore, we use the parameter c, where $c \in (0,1)$, to scale the error variance to be less in latitude than longitude. The ρ parameter scales the degree of covariance between latitude and longitude; its prior is specified in Appendix S1.
The indicator $w(t)$ determines which mixture component gives rise to the observed location and is modelled as Bern(0·5). The covariance matrix of the rotated distribution, $\tilde{\Sigma}_j$, is calculated as $\tilde{\Sigma}_j = H \Sigma_j H'$ where $H$ is a transformation matrix equal to
$$H = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$$ (eqn 3)
for j = 1,…,6, and $H = I$ (the identity matrix) for j = 7. The mixture model accounts for the fact that Argos error locations do not follow a symmetric distribution around the true location, but are more likely to be found in an X‐pattern, due to the polar orbit of the satellites (Costa et al. 2010; Douglas et al. 2012). In preliminary analyses not presented here, the multivariate normal mixture model fit the data better than a multivariate normal non‐mixture model. Argos locations are commonly modelled with a t‐distribution to account for extreme outliers (following Jonsen, Flemming & Myers 2005); however, the mixture model allows us to model anisotropic outliers. Though the aforementioned studies have modelled or estimated Argos error, the information is not directly applicable in the form of priors because the mixture model is a novel method for modelling Argos error and there is significant variability in reported estimates of Argos error (Costa et al. 2010). Beginning in 2011, the Argos system implemented a new algorithm that provides an error ellipse, as opposed to a radius, for each location (Lopez et al. 2014). Recent work by McClintock et al. (2014b) used the ellipse parameters provided by the Argos system and a bivariate normal distribution to model the data.
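
To make the X‐pattern mixture concrete, the following is a minimal Python sketch, not the authors' implementation (their R code is in Appendix S3). It assumes a correlation matrix with off‐diagonal ρ√c and latitude variance scaled by c, and it assumes the transformation matrix is a reflection of the latitude axis; both forms, the function names and the numerical values of σ², ρ and c are illustrative assumptions only.

```python
import math
import random

def matmul2(a, b):
    """Multiply two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def argos_covariances(sigma2, rho, c):
    """Return (Sigma, Sigma_tilde) for one Argos error class (assumed forms)."""
    off = rho * math.sqrt(c)
    sigma = [[sigma2, sigma2 * off], [sigma2 * off, sigma2 * c]]
    h = [[1.0, 0.0], [0.0, -1.0]]  # assumed reflection; H equals its transpose
    sigma_tilde = matmul2(matmul2(h, sigma), h)  # H Sigma H'
    return sigma, sigma_tilde

def draw_observation(z, sigma, sigma_tilde, rng):
    """Draw one observed location around the true location z."""
    cov = sigma if rng.random() < 0.5 else sigma_tilde  # w ~ Bern(0.5)
    # Cholesky factor of a 2x2 covariance matrix
    l11 = math.sqrt(cov[0][0])
    l21 = cov[1][0] / l11
    l22 = math.sqrt(cov[1][1] - l21 * l21)
    e1, e2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    return (z[0] + l11 * e1, z[1] + l21 * e1 + l22 * e2)

# Hypothetical parameter values, chosen only for illustration
sigma, sigma_tilde = argos_covariances(sigma2=1.0, rho=0.8, c=0.5)
rng = random.Random(1)
obs = [draw_observation((0.0, 0.0), sigma, sigma_tilde, rng) for _ in range(5)]
```

The two components share their variances and differ only in the sign of the covariance, which is what produces the two crossing arms of the X.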

Process Model

In the FDA paradigm, a continuous process for a set of times $t \in \mathcal{T}$ is written as an expansion of M basis functions of order k:
$$z(t) = \sum_{m=1}^{M} \beta_m \phi_m(t)$$ (eqn 4)
where z(t) is the curve of interest, $\beta_m$ is a coefficient that determines the weight of each basis function in the construction of the curve, and $\phi_m(t)$ is a particular basis function (Levitin et al. 2007). The type of pattern present in the data dictates the best choice of basis function; for example, splines are often used for non‐periodic data, Fourier series for periodic data, and wavelet bases for data with sharp localized patterns. We employed the B‐spline basis, which is commonly used in semiparametric regression, because it has local support and stable numerical properties when the number of knots (the points at which the basis functions connect) is large (Keele 2008; Ruppert, Wand & Carroll 2003). However, the model we present is general enough to accommodate any type of basis functions. B‐spline basis functions are defined recursively according to the Cox‐de Boor formula (see De Boor 1978). Let $B_{m,k}(t)$ denote the mth B‐spline basis function of order k (cubic B‐splines are 4th order and 3rd degree) for the knot sequence τ, where $k \le K$. Then for $m = 1,\ldots,N + 2K - k$,
$$B_{m,k}(t) = \frac{t - \tau_m}{\tau_{m+k-1} - \tau_m} B_{m,k-1}(t) + \frac{\tau_{m+k} - t}{\tau_{m+k} - \tau_{m+1}} B_{m+1,k-1}(t)$$ (eqn 5)
where N is the number of interior knots (Hastie, Tibshirani & Friedman 2009).
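
The Cox‐de Boor recursion can be sketched in a few lines of Python. This is an illustrative sketch rather than the code used for the analysis; the function names, the 0‐based indexing, the boundary knots repeated K times and the example knot locations are all assumptions made for the example.

```python
def bspline_basis(m, k, t, tau):
    """Value at t of the m-th (0-indexed) B-spline of order k on knots tau."""
    if k == 1:
        # Order-1 basis functions are indicators of half-open knot intervals
        return 1.0 if tau[m] <= t < tau[m + 1] else 0.0
    left = right = 0.0
    d1 = tau[m + k - 1] - tau[m]
    if d1 > 0:  # repeated knots give a zero denominator; the term is dropped
        left = (t - tau[m]) / d1 * bspline_basis(m, k - 1, t, tau)
    d2 = tau[m + k] - tau[m + 1]
    if d2 > 0:
        right = (tau[m + k] - t) / d2 * bspline_basis(m + 1, k - 1, t, tau)
    return left + right

def cubic_design_row(t, interior_knots, lower, upper):
    """All cubic (order 4) B-splines evaluated at a single time t."""
    order = 4
    tau = [lower] * order + list(interior_knots) + [upper] * order
    n_basis = len(interior_knots) + order  # N + 2K - k with k = K = 4
    return [bspline_basis(m, order, t, tau) for m in range(n_basis)]

# Hypothetical example: four interior knots on the interval [0, 5)
row = cubic_design_row(2.5, [1.0, 2.0, 3.0, 4.0], 0.0, 5.0)
```

Because of the half‐open intervals in the order‐1 case, this sketch evaluates to zero exactly at the upper boundary; times strictly inside the interval give non‐negative values that sum to one.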
In the spatial statistics and signal processing framework, a continuous stochastic process is often written as a convolution, or a moving average, of a smoothing kernel function, k(τ−t) and a latent process (e.g., white noise), η(τ):
$$z(t) = \int_{\mathcal{T}} k(\tau - t)\,\eta(\tau)\,d\tau$$ (eqn 6)
for $t \in \mathcal{T}$ (Calder 2007; Higdon 2002; Lee et al. 2002). When discretized, eqn 6 takes on the general form of eqn 4 (Calder 2007; Higdon 2002; Lee et al. 2002). Non‐stationary processes can be modelled by allowing the kernel to be a function of time (or space) and not just distance (Cressie & Wikle 2011; Higdon 2002; Higdon, Swall & Kern 1999). In the context of animal movement, one can consider the smoothing kernel as some function that imposes temporal dependence on the observed locations (the latent process) to create a continuous and smooth movement path.
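
The link between the convolution in eqn 6 and the basis expansion in eqn 4 can be made explicit with a short worked step. The notation below is an assumption made for illustration: the latent process is evaluated on a regular grid of M points $\tau_1,\ldots,\tau_M$ with spacing $\Delta\tau$, and $\beta_m$ and $\phi_m(t)$ stand for the coefficients and basis functions of eqn 4.

```latex
z(t) = \int k(\tau - t)\,\eta(\tau)\,d\tau
     \approx \sum_{m=1}^{M} k(\tau_m - t)\,\eta(\tau_m)\,\Delta\tau
     = \sum_{m=1}^{M} \beta_m\,\phi_m(t),
\qquad
\phi_m(t) \equiv k(\tau_m - t), \quad
\beta_m \equiv \eta(\tau_m)\,\Delta\tau .
```

Under this reading, each basis function is a copy of the kernel centred on a grid point, and the coefficients are the (scaled) values of the latent process at those points.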
In our case, the location of an individual at time t in each direction, z(t), is a function of an individual's geographic mean in that direction, $\beta_0$, and the summation of M cubic B‐splines evaluated at time t, $x_m(t)$, and the regularized, direction‐specific coefficient, $\beta_m$, for that B‐spline. The location in longitude and latitude is
$$z_{\mathrm{lon}}(t) = \beta_{0,\mathrm{lon}} + \sum_{m=1}^{M} x_m(t)\,\beta_{m,\mathrm{lon}}$$ (eqn 7)
$$z_{\mathrm{lat}}(t) = \beta_{0,\mathrm{lat}} + \sum_{m=1}^{M} x_m(t)\,\beta_{m,\mathrm{lat}}$$ (eqn 8)
Using matrix notation, we can write eqn 7 and eqn 8 jointly as
$$z(t) = \beta_0 + X(t)\beta$$ (eqn 9)
where z(t) is a vector describing the location in space at time t. The matrix X(t) is a 2‐by‐2M matrix where $x(t)'$ is a row vector containing all of the B‐splines evaluated at time t, such that
$$X(t) = \begin{pmatrix} x(t)' & 0' \\ 0' & x(t)' \end{pmatrix}$$ (eqn 10)
As such, it can be multiplied by a single 2M‐by‐1 vector of regularized coefficients
$$\beta = (\beta_{1,\mathrm{lon}},\ldots,\beta_{M,\mathrm{lon}},\beta_{1,\mathrm{lat}},\ldots,\beta_{M,\mathrm{lat}})'$$ (eqn 11)
The regularized coefficients for higher‐order splines are not generally interpreted (Weisberg 2014), but can be thought of as the contribution, or the directional forcing, of that basis function to the process at that time. The intercept, $\beta_0$, can be interpreted as the geographic centre of mass for each individual, for which we specified a relatively uninformative 2‐dimensional normal prior (Appendix S1). We specified a normal prior with mean 0 and covariance matrix $\Sigma_\beta$ for the coefficients such that
$$\beta \sim \mathrm{N}(0, \Sigma_\beta)$$ (eqn 12)
We selected three sets of B‐splines and varied the number of knots to align with temporal scales we believe are biologically important for lynx movement: year, season (3 months) and month. Including multiple sets of basis functions allows the continuous function to capture behaviour at different temporal scales without losing predictive capability when there is an absence of fine‐scale temporal data. However, the required number of knots results in a large design matrix of coefficients that is difficult to visualize; for example, there were 36 and 41 basis functions for the two Canada lynx presented in the case study. The number of basis functions will increase as the length of the time series increases. We used the covariance matrix
$$\Sigma_\beta = \begin{pmatrix} \sigma^2_{\beta,\mathrm{lon}} I & 0 \\ 0 & \sigma^2_{\beta,\mathrm{lat}} I \end{pmatrix}$$ (eqn 13)
as a regulator in the ridge regression framework to shrink the $\beta$ coefficients. The variance terms, $\sigma^2_{\beta,\mathrm{lon}}$ and $\sigma^2_{\beta,\mathrm{lat}}$, control the smoothing in each dimension; a very small variance leads to underfitting, whereas a large variance can lead to overfitting (Eilers & Marx 1996). We selected the variance components by calculating the Deviance Information Criterion (DIC; Spiegelhalter et al. 2002) over 10 000 MCMC iterations and optimizing the DIC over 400 pairs of variance components (Appendix S2). In simulation, we found that DIC and K‐fold cross‐validation methods performed similarly. The details of regularization and ridge regression are beyond the scope of this paper and are explored in more detail in Hastie, Tibshirani & Friedman (2009) and Hooten & Hobbs (2015).
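
The shrinkage role of the regulator variance can be illustrated with a deliberately simplified Python sketch. It is not the authors' model: it treats one direction only, assumes a known Gaussian error variance, uses linear "hat" functions in place of cubic B‐splines, and computes the closed‐form conditional posterior mean of the coefficients under a zero‐mean normal prior, which coincides with the ridge estimate. All function names and data values are hypothetical.

```python
def hat(t, centre, width):
    """Linear (order-2) B-spline centred at `centre`; a stand-in basis."""
    return max(0.0, 1.0 - abs(t - centre) / width)

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for cc in range(col, n + 1):
                m[r][cc] -= f * m[col][cc]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def ridge_coefficients(times, y, centres, width, sigma2, sigma2_beta):
    """Posterior mean of beta for y ~ N(X beta, sigma2 I), beta ~ N(0, sigma2_beta I)."""
    X = [[hat(t, c, width) for c in centres] for t in times]
    p = len(centres)
    lam = sigma2 / sigma2_beta  # ridge penalty implied by the prior variance
    xtx = [[sum(row[i] * row[j] for row in X) + (lam if i == j else 0.0)
            for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    return solve(xtx, xty)

# Hypothetical, irregularly spaced and centred observations in one direction
times = [0.0, 0.4, 1.1, 2.5, 2.9, 4.2, 5.0]
y = [0.1, 0.3, 0.9, -0.2, -0.5, 0.4, 0.6]
centres = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
tight = ridge_coefficients(times, y, centres, 1.0, sigma2=0.1, sigma2_beta=0.01)
loose = ridge_coefficients(times, y, centres, 1.0, sigma2=0.1, sigma2_beta=100.0)
```

A small prior variance (`tight`) pulls the coefficients towards zero and flattens the fitted path, whereas a large prior variance (`loose`) lets the path follow the observations closely.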
The model described above yields the posterior distribution
$$[\beta_0, \beta, \sigma^2, \rho, c, w \mid S] \propto \prod_{t \in \mathcal{T}} [s_j(t) \mid \beta_0, \beta, \sigma_j^2, \rho, c, w(t)]\,[w(t)] \times [\beta_0]\,[\beta]\,[\rho]\,[c] \prod_{j=1}^{7} [\sigma_j^2]$$ (eqn 14)
where $\sigma^2 \equiv (\sigma_1^2,\ldots,\sigma_7^2)'$, w is a vector of the indicators $w(t)$, and S is a matrix of observed locations. This is the form of a typical ‘integrated’ model where multiple data sources provide information about the same underlying processes. Similar multidata source models have become popular in demographic studies (e.g., Barker 1997; Burnham 1993; Nasution et al. 2001; Schaub & Abadi 2011), but have not been as common in movement studies (but see Winship et al. 2012). If inference for multiple individuals is desired, the data model can be shared among individuals while the process model parameters ($\beta_0$, β) and regulator ($\Sigma_\beta$) can be allowed to vary by individual. This model can be extended to account for additional stochasticity using a first‐order Gaussian process, such that $z(t) \sim \mathrm{N}(\beta_0 + X(t)\beta, \Sigma_z)$, where $\Sigma_z$ accounts for process error separately. Such Gaussian processes are commonly used as statistical emulators of complicated nonlinear mechanistic models (Hooten et al. 2011; O’Hagan & Kingman 1978).

See Appendix S1 for prior specifications. The model was fit using Markov chain Monte Carlo (MCMC), and a Gibbs sampler was constructed to sample from the posterior using the full‐conditional distributions for all parameters except ρ and c, because they were not conjugate. Metropolis‐Hastings was used to sample ρ and c. See Appendix S3 for R code (R Core Team 2013).
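
As an illustration of one Gibbs step, the full‐conditional distribution of a mixture indicator follows directly from the two‐component data model and its Bern(0·5) prior: it is Bernoulli with probability proportional to the prior weight times each component's bivariate normal density. The Python sketch below is derived from that reasoning and is not taken from Appendix S3; the function names and the covariance values in the example are hypothetical.

```python
import math
import random

def mvn2_logpdf(x, mean, cov):
    """Log density of a bivariate normal with a symmetric 2x2 covariance."""
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    a, b, d = cov[0][0], cov[0][1], cov[1][1]
    det = a * d - b * b
    quad = (d * dx * dx - 2.0 * b * dx * dy + a * dy * dy) / det
    return -math.log(2.0 * math.pi) - 0.5 * math.log(det) - 0.5 * quad

def prob_w_is_one(s, z, sigma, sigma_tilde, prior=0.5):
    """Full-conditional probability that s arose from the first component."""
    l1 = math.log(prior) + mvn2_logpdf(s, z, sigma)
    l0 = math.log(1.0 - prior) + mvn2_logpdf(s, z, sigma_tilde)
    mx = max(l1, l0)  # subtract the maximum for numerical stability
    e1, e0 = math.exp(l1 - mx), math.exp(l0 - mx)
    return e1 / (e1 + e0)

def gibbs_update_w(s, z, sigma, sigma_tilde, rng):
    """Draw the indicator from its Bernoulli full conditional."""
    return 1 if rng.random() < prob_w_is_one(s, z, sigma, sigma_tilde) else 0

# Hypothetical covariances for the two arms of the X-pattern
sigma = [[1.0, 0.5], [0.5, 0.5]]
sigma_tilde = [[1.0, -0.5], [-0.5, 0.5]]
p_same_sign = prob_w_is_one((1.0, 0.5), (0.0, 0.0), sigma, sigma_tilde)
p_opp_sign = prob_w_is_one((1.0, -0.5), (0.0, 0.0), sigma, sigma_tilde)
p_on_axis = prob_w_is_one((1.0, 0.0), (0.0, 0.0), sigma, sigma_tilde)
w_draw = gibbs_update_w((1.0, 0.5), (0.0, 0.0), sigma, sigma_tilde, random.Random(7))
```

An error that lies along the positively correlated arm favours the first component, an error along the other arm favours the second, and an error on a coordinate axis is uninformative.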

Characterizing Movement

We are interested in quantities derived from z(t) that can be used as movement descriptors. We describe three relevant derived quantities; however, our framework can be extended to other systems and conservation questions by modifying these quantities. These derived quantities represent the physical outcome in the movement path from various movement behaviours. The Bayesian framework allows us to obtain inference for derived quantities through Monte Carlo integration. We can visualize these quantities both temporally and spatially. All quantities are calculated in the MCMC algorithm using techniques described in Appendix S4 (spatial quantities) and Appendix S5 (temporal quantities).
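
Monte Carlo integration for a derived quantity amounts to computing the quantity on every posterior draw of the path and then summarizing the resulting values. The Python sketch below shows the idea under stated assumptions: the posterior draws are represented as lists of (x, y) tuples, the derived quantity is total path length (chosen only as a simple example), and the draws themselves are fabricated placeholders rather than model output.

```python
import math

def path_length(path):
    """Example derived quantity: total distance along a path of (x, y) points."""
    return sum(math.hypot(b[0] - a[0], b[1] - a[1])
               for a, b in zip(path, path[1:]))

def summarize_derived(draws, derive):
    """Posterior mean and an approximate 95% interval of a derived quantity."""
    values = sorted(derive(path) for path in draws)
    n = len(values)
    mean = sum(values) / n
    lower = values[math.floor(0.025 * (n - 1))]
    upper = values[math.ceil(0.975 * (n - 1))]
    return mean, lower, upper

# Placeholder "posterior draws": straight paths of length 1, 2, ..., 101
draws = [[(0.0, 0.0), (float(k), 0.0)] for k in range(1, 102)]
mean, lower, upper = summarize_derived(draws, path_length)
```

Any function of the path can be passed as `derive`, so the same summary applies to residence time, speed or turning angle once they are computed per draw.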

To describe the quantities of interest spatially, we define a grid of equally sized regions, $A_l$ for l = 1,...,L, that comprise the area for which we desire inference. This method is similar to that used by Johnson, London & Kuhn (2011) to describe diving behaviour of northern fur seals (Callorhinus ursinus). The first derived quantity we describe is residence time, $r(A_l)$, which is calculated on each MCMC iteration as a per area frequency of locations in region $A_l$:

$$r(A_l) = \frac{\sum_{t \in \mathcal{T}} I\{z(t) \in A_l\}}{|A_l|}$$ (eqn 15)
where the indicator I identifies whether location z(t) was in region $A_l$.
The second derived quantity of interest is speed. To calculate the average speed per unit of area, we first need the velocity between the location at time t, z(t), and the location at time t−Δt, z(t−Δt). When Δt is sufficiently small, the first derivative of z(t) with respect to t can be approximated by
$$\frac{dz(t)}{dt} \approx \delta(t)$$ (eqn 16)
where
$$\delta(t) = \frac{z(t) - z(t - \Delta t)}{\Delta t}$$ (eqn 17)
In practice, Δt is constant for the entire time series, and velocity is related to speed ν(t) such that
$$\nu(t) = \sqrt{\delta(t)'\delta(t)}$$ (eqn 18)
The average speed in $A_l$, given a positive residence time, is
$$\bar{\nu}(A_l) = \frac{\sum_{t \in \mathcal{T}} \nu(t)\, I\{z(t) \in A_l\}}{\sum_{t \in \mathcal{T}} I\{z(t) \in A_l\}}$$ (eqn 19)

A large average speed describes areas where the individual was moving quickly and spending little time. Therefore, large average speeds (eqn 19) identify areas that individuals may use to travel.
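
The gridded residence and speed summaries can be sketched for a single realization of the path as follows. This Python example is an assumption‐laden illustration: the path is a list of (x, y) positions at a constant time step, grid cells are squares indexed by integer division of the coordinates, residence is reported as a raw count of locations per cell (dividing by the cell area would give a per‐area frequency), and each speed is attributed to the cell containing the later of the two positions. The function names and the toy path are hypothetical.

```python
import math

def speeds(path, dt):
    """Speed between consecutive positions of a regularly spaced path."""
    return [math.hypot(b[0] - a[0], b[1] - a[1]) / dt
            for a, b in zip(path, path[1:])]

def grid_summaries(path, dt, cell):
    """Map each visited grid cell to (location count, mean speed)."""
    totals = {}
    for (x, y), v in zip(path[1:], speeds(path, dt)):
        key = (math.floor(x / cell), math.floor(y / cell))
        n, total = totals.get(key, (0, 0.0))
        totals[key] = (n + 1, total + v)
    # Cells never visited are absent, so every mean has a positive count
    return {k: (n, total / n) for k, (n, total) in totals.items()}

# Toy path: slow movement near the origin, then two fast steps
path = [(0.0, 0.0), (0.1, 0.0), (0.2, 0.0), (5.2, 0.0), (10.2, 0.0)]
summary = grid_summaries(path, dt=1.0, cell=1.0)
```

In the toy path, the cell at the origin accumulates two slow locations, while the cells reached by the fast steps each hold one location with a high speed, which is the contrast used to separate residence areas from travel routes.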

Persistence in direction is the third metric of interest and may be useful for describing directed, as opposed to nomadic, movement. We can describe persistence in direction by deriving the turning angle, θ, using the velocity $\delta(t)$ calculated in eqn 17,
$$\theta(t) = \cos^{-1}\left(\frac{\delta(t)'\delta(t - \Delta t)}{\lVert\delta(t)\rVert\,\lVert\delta(t - \Delta t)\rVert}\right)$$ (eqn 20)
Given that residence time is positive, the average turning angle in region $A_l$ is
$$\bar{\theta}(A_l) = \frac{\sum_{t \in \mathcal{T}} \theta(t)\, I\{z(t) \in A_l\}}{\sum_{t \in \mathcal{T}} I\{z(t) \in A_l\}}$$ (eqn 21)
Alternatively, we can describe these quantities temporally, negating the need for a spatially defined grid. This decreases computation time and allows the quantities to be visualized temporally and spatially. Speed and persistence in direction can be calculated as they were in eqn 18 and eqn 20 and residence time can be calculated as the inverse of speed:
$$r(t) = \frac{1}{\nu(t)}$$ (eqn 22)
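
The temporal versions of the descriptors can be sketched in Python as below. The turning angle is computed here as the unsigned angle between consecutive displacement vectors, which is one common definition and is an assumption of this sketch, as are the function names and the toy path; residence time is taken as the inverse of speed, as described above.

```python
import math

def turning_angles(path):
    """Unsigned angle (radians) between consecutive displacement vectors."""
    angles = []
    for a, b, c in zip(path, path[1:], path[2:]):
        d1 = (b[0] - a[0], b[1] - a[1])
        d2 = (c[0] - b[0], c[1] - b[1])
        n1, n2 = math.hypot(*d1), math.hypot(*d2)
        if n1 == 0.0 or n2 == 0.0:
            angles.append(float("nan"))  # direction undefined without movement
            continue
        cos_angle = (d1[0] * d2[0] + d1[1] * d2[1]) / (n1 * n2)
        angles.append(math.acos(max(-1.0, min(1.0, cos_angle))))
    return angles

def temporal_residence(speed_values):
    """Residence time as the inverse of speed (infinite when stationary)."""
    return [1.0 / v if v > 0.0 else float("inf") for v in speed_values]

# Toy path: straight ahead, then two left turns of 90 degrees
path = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (2.0, 1.0), (1.0, 1.0)]
angles = turning_angles(path)
residence = temporal_residence([2.0, 0.5])
```

Angles near zero indicate persistent, directed movement, whereas angles near π indicate reversals; no grid is needed because each value is indexed by time alone.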

Case Study: Canada Lynx Reintroduction in Colorado

Colorado Division of Wildlife (now Colorado Parks and Wildlife) initiated a reintroduction programme for Canada lynx (Lynx canadensis) in 1997. Between 1999 and 2006, 218 wild‐caught lynx from Alaska, Yukon Territory, British Columbia, Manitoba and Quebec were released in the San Juan Mountains within 40 km of the Rio Grande Reservoir (Devineau et al. 2010). Individuals were fitted with either VHF collars (Telonics, Mesa, AZ, USA) that were active for 12 h per day or satellite/VHF collars (Sirtrack, Havelock North, New Zealand) that were active for 12 h per week with locations obtained using the Argos system (Devineau et al. 2010). Weekly airplane flights were conducted over a 20 684 km² area, which included the reintroduction area and surrounding high‐elevation sites (>2591 m; Devineau et al. 2010); attempts were made to locate each VHF‐collared individual in the study area once every 2 weeks. Additional flights outside of the study area were conducted when feasible and during denning season (Devineau et al. 2010). Accuracy of VHF locations was self‐reported as 50–500 m (Devineau et al. 2010). Irregular location data were obtained from 1999 to 2011 due to one or both of the transmitter components failing, logistical constraints or movement out of the study area precluding VHF data collection. Therefore, data for each individual vary in the length of the time series, the temporal regularity of locations and the number of locations from each data type and error class. We have analysed the telemetry data from two Canada lynx (Appendix S6).

We obtained 10 000 MCMC iterations, with a burn‐in period of 1000 iterations. All data used in this paper are available in Appendix S7. Additional results from fitting the model to simulated data are available in Appendix S8.

Results

To visualize the fit of the model to the data, we calculated standard posterior quantities, such as means and 95% credible intervals for the marginal location in each direction (Fig. 1a, b). Increasing uncertainty is evident during long periods of missing data (Fig. 1a, b). The derived quantities were scaled relative to the maximum value for that quantity over the individual's lifetime and plotted both spatially, on a map of Colorado (Figs 1c, d, and 2), and temporally (Fig. 2). These relative values are useful for visualizing the degree of each behaviour at a given time point, despite the quantities having different units; the degree of shading represents the strength of that behaviour, and the point size corresponds to the spatial uncertainty (Figs 1c, d, and 2). The optimal variance terms for the regulator matrix (eqn 13) and the mean and 95% credible intervals for the covariance matrix (eqn 2) are presented in Appendix S6.
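The posterior summaries described above can be sketched in Python as follows. This is a minimal illustration, not the authors' code: the array `draws` is simulated stand-in data for MCMC output of one derived quantity, the burn-in and iteration counts mirror those stated in the case study, and scaling by the maximum of the posterior mean is an assumed reading of "relative to the maximum value for that quantity".

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical MCMC output: 10 000 draws of one derived quantity
# (e.g. speed) at 50 time points. Simulated here for illustration only.
n_iter, n_burn, n_times = 10_000, 1_000, 50
draws = rng.gamma(shape=2.0, scale=1.0, size=(n_iter, n_times))

# Discard the burn-in period before summarizing.
kept = draws[n_burn:]

# Posterior mean and equal-tailed 95% credible interval at each time point.
post_mean = kept.mean(axis=0)
lower, upper = np.percentile(kept, [2.5, 97.5], axis=0)

# Scale relative to the maximum over the whole time series, so that
# quantities with different units can share one plotting scale.
relative = post_mean / post_mean.max()
```

The resulting `relative` values lie in (0, 1] and could drive the shading of plotted points, with the credible interval width informing point size.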

Figure 1. Mean and 95% credible intervals of the marginal locations for two Canada lynx [BC03M04 (a) and BC03F03 (b)], with the observed locations. The posterior mean of each movement descriptor, shown with the counties of Colorado, for individuals BC03M04 (c) and BC03F03 (d). The size of the point corresponds to spatial uncertainty, and the transparency indicates the strength of the behaviour at that location; for visualization purposes, any value below 25% of the maximum value for that behaviour is not shown. Coordinates correspond to Universal Transverse Mercator zone 13N.
Figure 2. Mean relative movement descriptors through time and space for two Canada lynx reintroduced to Colorado [BC03M04 (a) and BC03F03 (b)]. Coordinates correspond to Universal Transverse Mercator zone 13N.

Both individuals had multiple periods of fast speeds, large turning angles and high residence times (Figs 1c, d and 2). For these individuals, high residence time often coincided with a large turning angle; however, these behavioural quantities were not always concurrent (Fig. 2). For example, individual BC03M04 displayed periods early in the time series where the turning angle was the strongest quantity, while speed and residence time were fairly low, suggesting a searching or nomadic behaviour (Fig. 2a). Both time series culminated with the individuals residing in two specific counties (Clear Creek and Summit), which include an area that is considered important lynx habitat (Loveland Pass; Colorado Parks and Wildlife, pers. commun.). These results also indicate that lynx are capable of consistent long‐term movement across large distances without establishing an area of high residence time. For example, within a period of two months, individual BC03F03 travelled approximately 480 km (posterior mean), from the southern portion of Colorado (Mineral County) to southern Wyoming (Medicine Bow National Forest, specifically the area located within Carbon and Albany counties; Fig. 2b).

Discussion

The process model we propose falls within the same class of models as statistical emulators, functional data models and process convolutions, and we showed that it can be written in much the same way. The model presented could be written as a hierarchical model, by allowing the latent process to be stochastic. However, it is well known that hierarchical models with two sources of unstructured error and lacking replication will have identifiability issues (Hobbs & Hooten 2015). Given a situation where there are strong constraints on the movement process, it may be possible to separate data and process error. For example, Brost et al. (In Press) were able to separately estimate data and process error in a resource selection framework by constraining the spatial domain of the process. However, their study focused on a marine mammal, and therefore, the process can be constrained to the marine environment (Brost et al. In Press). Constraining movement in a terrestrial environment may be possible but is less intuitive and will impose strong assumptions.

In addition, one of the benefits of using a functional data approach is its flexibility, in contrast to more constrained mechanistic models. The simplicity of the process model results in greater computational efficiency than other available methods for movement modelling. For example, our model can be fit on the order of minutes for each individual, compared to other models that require on the order of days (e.g., Hooten et al. 2010; McClintock et al. 2014a). Although small‐scale movement patterns may be difficult to detect given the coarse temporal resolution and the large amount of measurement error associated with Argos locations, large‐scale movement patterns are easily discernible and informative. However, researchers analysing data at a finer temporal scale could discern small‐scale movement with properly scaled basis functions (e.g., daily or weekly).

The model can be used to estimate an animal's movement path alone, but is especially useful for learning about movement behaviours that describe how individuals are utilizing the landscape. For example, persistence in direction may be used to infer when and where an individual is migrating or dispersing, whereas variation in direction may indicate habitat suitable for a home range (Haddad 1999; Morales et al. 2004). In the data we analysed, the movement descriptors corresponded with anecdotal evidence of lynx movement behaviour. Many existing methods for analysing location data explicitly model the quantities that give rise to the movement path (e.g., speed, turning angle, residence time, velocity), such that the quantities must be estimated while fitting the model (mechanistic models; e.g., Breed et al. 2012; Johnson et al. 2008; Jonsen, Flemming & Myers 2005; McClintock et al. 2012; Morales et al. 2004; Winship et al. 2012). In contrast, we use the equivariance property of MCMC to calculate derived quantities as well as the proper uncertainty associated with each behaviour (Hobbs & Hooten 2015). Alternative ad hoc methods could be used, such as calculating derived quantities based on the mean predicted path, but to ensure the validity of those quantities as estimators with proper uncertainty, a procedure like the one we describe is necessary. Quantities of interest beyond those presented can be derived, such as bearing or tortuosity, or summarized with respect to temporal and spatial features. However, our model would need to be adjusted to accommodate other sources of measurement error (e.g., GPS data).
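The contrast between the draw-by-draw procedure and the ad hoc plug-in approach can be sketched in Python. This is a minimal illustration under stated assumptions: the posterior path draws are simulated as a known curve plus independent noise (real posterior draws would be smoother and come from the fitted model), and speed is approximated by finite differences.

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical posterior draws of a 2-D path: 2000 draws at 41 time points.
times = np.linspace(0.0, 10.0, 41)
true_path = np.column_stack([times, np.sin(times)])
path_draws = true_path + rng.normal(scale=0.3, size=(2000, times.size, 2))

# Derived quantity computed for every MCMC draw, then summarized.
# The draws of speed form a sample from its posterior distribution.
velocity_draws = np.gradient(path_draws, times, axis=1)
speed_draws = np.linalg.norm(velocity_draws, axis=2)
speed_mean = speed_draws.mean(axis=0)
speed_ci = np.percentile(speed_draws, [2.5, 97.5], axis=0)

# Ad hoc plug-in: speed of the posterior mean path. This yields a single
# curve with no uncertainty attached.
mean_path = path_draws.mean(axis=0)
speed_plugin = np.linalg.norm(np.gradient(mean_path, times, axis=0), axis=1)
```

Because speed is a nonlinear function of the path, `speed_plugin` generally differs from `speed_mean`, and only the draw-by-draw version carries a credible interval.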

The model that we developed may be particularly well suited for analysing data sets that have not been collected explicitly for movement analysis. These data sets may contain multiple data types, have large amounts of error and have been collected at a coarse temporal resolution. As such, they may not be conducive to fine‐scale mechanistic movement modelling. We used a data set that embodied these characteristics, the telemetry data from the Canada lynx reintroduction to Colorado, to demonstrate that the FDA approach can be used to estimate movement paths and associated movement descriptors. The biological inference from the derived movement descriptors can also be extended beyond what we show here. For example, our framework could be extended to incorporate spatial and temporal covariates into the process model, similar to the approach described by Hanks, Hooten & Alldredge (2015). In addition, the spatial distribution of the movement descriptors can be used to summarize movement behaviour across linear landscape features such as roads. Likewise, movement behaviour through nonlinear landscape features, such as National Parks, can be described with the average posterior mean of a movement descriptor within a spatial boundary. Our model can also be generalized for use with multiple individuals. In this case, the derived quantities can be aggregated to describe population‐level movement. This type of population movement model allows the Argos and VHF covariance matrices to borrow strength across individuals, potentially improving parameter estimates. Such extensions are the subject of ongoing research.
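As a rough illustration of summarizing a movement descriptor within a spatial boundary, the Python sketch below averages posterior mean values over the locations that fall inside a rectangle. The rectangular boundary, the function name `mean_descriptor_in_box` and the example values are all hypothetical simplifications; a real park boundary would be a polygon and would call for a point-in-polygon test.

```python
import numpy as np

def mean_descriptor_in_box(locations, descriptor, xmin, xmax, ymin, ymax):
    """Average a descriptor over the locations inside a rectangular boundary.

    locations  : array of shape (T, 2), posterior mean x and y coordinates
    descriptor : array of shape (T,), posterior mean of a movement descriptor
    """
    inside = (
        (locations[:, 0] >= xmin) & (locations[:, 0] <= xmax)
        & (locations[:, 1] >= ymin) & (locations[:, 1] <= ymax)
    )
    if not inside.any():
        return np.nan  # the path never enters the boundary
    return descriptor[inside].mean()

# Hypothetical posterior mean locations and relative residence times.
locations = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [5.0, 5.0]])
descriptor = np.array([0.2, 0.4, 0.6, 1.0])
avg = mean_descriptor_in_box(locations, descriptor, 0.5, 2.5, 0.5, 2.5)
```

Applying the same function to each MCMC draw, rather than to posterior means, would also give the uncertainty of the within-boundary average.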

Acknowledgements

Any use of trade, firm or product names is for descriptive purposes only and does not imply endorsement by the US Government. Funding was provided by Colorado Parks and Wildlife (1304) and the National Park Service (P12AC11099). Data were provided by Colorado Parks and Wildlife. The authors would like to thank the anonymous reviewers for their constructive commentary that helped improve the manuscript.

Data accessibility

Case Study Data: uploaded as online supporting information in Appendix S6: Case study tables.
