Predicting aquatic animal movements and behavioural states from acoustic telemetry arrays
Handling Editor: Chris Sutherland
Abstract
- Fine-scale tracking with passive acoustic telemetry can yield great insights into the movement ecology of aquatic animals. To predict fine-scale positions of tagged animals in continuous space from spatially-discrete detection data, state-space modelling through the R package YAPS provides a promising alternative to frequently used positioning algorithms. However, YAPS cannot currently classify multiple kinds of movement that may be used as proxies for individual behaviours of study animals (behavioural states), an endeavour that is of increasing interest to movement ecologists.
- We advance YAPS by incorporating the functionality to predict behavioural states by using an iterative maximization framework. Our model, which we call YAMS, is formulated in continuous time, and we therefore adapt current hidden Markov model (HMM) machinery to accommodate this while remaining within a likelihood framework that provides rapid fitting. We test our model using simulations and approximately 6 days’ worth of northern pike data from Hald Lake, Denmark.
- YAMS is shown to produce accurate parameter estimates and random effect predictions when model results were compared to simulated data, with behavioural state accuracies of 0.94 and 0.79 for two- and three-state models, respectively, and location state root mean squared errors of 1.8 m for both models. In addition, the behavioural states are shown to reflect varying speeds of the pike, yielding a highly interpretable classification.
- This research has the potential to be broadly applicable to both ecologists interested in identifying fine-scale space use and behavioural states from acoustic telemetry data, as well as to statisticians who may wish to use standard HMM machinery to fit continuous-time HMMs to animal movement data.
1 INTRODUCTION
Telemetry is a staple technology used to track aquatic animals and infer how movement relates to physiology, life history, oceanographic and environmental constraints and anthropogenic actions (Hays et al., 2016; Hussey et al., 2015; Lennox et al., 2017). Acoustic telemetry has been instrumental for monitoring the movements of several taxonomic groups including teleost fishes, elasmobranchs, and crustaceans (Hussey et al., 2015; Lennox et al., 2017). With acoustic telemetry, we can track both large scale movements of individuals over extensive periods of time (years) and across oceans (McAuley et al., 2017), as well as fine-scale, high resolution movements restricted to small study areas (Cote et al., 2019).
Acoustic telemetry is a two-part system wherein receivers record ID codes that are transmitted from tags typically either surgically implanted or externally attached to the study animals. A transmission can only be recorded by a receiver if it originates within the receiver’s detection range. When a receiver array is designed such that detection ranges overlap (i.e. when receivers are spaced approximately 20–500 m apart; Roy et al., 2014; Trancart et al., 2020, and Figure 1 of Binder et al., 2017) and transmissions can be detected at multiple receivers, algorithms based on the differences among arrival times of a single transmission at multiple receivers can be used to calculate positions of the tagged animals on a spatially-continuous scale (e.g. Baktoft et al., 2017; Espinoza et al., 2011; Trancart et al., 2020). Positioning algorithms are often closed source (but see Trancart et al., 2020), expensive if carried out by the manufacturer, and the output can contain large amounts of error (Baktoft et al., 2017; Roy et al., 2014).

A recently developed alternative is the R package Yet Another Positioning Solver, or YAPS (Baktoft et al., 2017). This open source software fits a hierarchical (state-space) model with two levels to detection data: a measurement process that captures the variability in transmission arrival times relative to their expected arrival times, and an unobserved movement process that assumes the underlying track follows a Wiener process (Baktoft et al., 2017). For model fitting, YAPS utilizes the R package Template Model Builder (TMB; Kristensen et al., 2016), a highly effective framework for fitting hierarchical models to animal movement data (Albertsen et al., 2015; Auger-Méthé et al., 2017; Jonsen et al., 2019). By accounting for stochasticity in both the measurement and movement processes, YAPS achieves greater precision in the location predictions compared to classic time-difference-of-arrival methods (Baktoft et al., 2017).
State-space models (SSMs), including YAPS, have become a popular tool in ecology (Auger-Méthé et al., 2021), and have proven particularly useful for predicting true locations from noisy movement data (e.g. Auger-Méthé et al., 2017; Johnson et al., 2008; Jonsen et al., 2005; McClintock et al., 2012; Patterson et al., 2008; Pedersen et al., 2008). Hidden Markov models (HMMs) have also seen significant parallel development, as they can classify multiple discrete states influencing the movement process, and these can be inferred to reflect animal behaviour (Auger-Méthé et al., 2021; McClintock et al., 2020). Behavioural states can be readily predicted from data sampled at regular temporal intervals via discrete-time HMMs (Michelot et al., 2016; Zucchini et al., 2016). However, most aquatic research can only record movements at opportunistic (irregularly sampled) times. Irregular sampling, if incorporated into movement models, has often been integrated into the movement process through the use of differential equations (e.g. Johnson et al., 2008), or into the measurement process via linear interpolation (e.g. Jonsen et al., 2005; McClintock et al., 2012). Fewer studies incorporate it directly into the behavioural process (but see Parton & Blackwell, 2017; Michelot & Blackwell, 2019). When it is incorporated, standard computational machinery for HMMs typically cannot be used because the movement of an animal no longer depends only on the current behavioural state, but can depend on one or more past states as well; this violates what is called the snapshot principle (Patterson et al., 2017). However, rigorous testing of deviations from the snapshot principle with respect to animal movement has not been documented.
A few methods exist for predicting both behavioural states and location states within a single statistical model (Jonsen et al., 2005; McClintock et al., 2012; Pedersen et al., 2008), which we call switching hierarchical models (SHMs). These efforts have traditionally focused on developing methods for satellite telemetry, and few methods exist specifically for acoustic telemetry detections. One exception is that of Dorazio and Price (2019), who developed a one-dimensional Bayesian SHM for linear movement within a river. In the two-dimensional case, HMMs have been directly fitted to positional data (e.g. Whoriskey et al., 2017), and to location predictions from SSM-filtered positional data (e.g. Cote et al., 2020). These methods typically require multiple separate modelling steps, for example, a positioning algorithm to obtain an animal path, followed by filtering, and finally an HMM to obtain behavioural states. A methodology that can simultaneously predict behavioural and location states directly from acoustic detections would provide a more unified statistical solution where uncertainty around the model parameters and state predictions is accounted for within a single framework. This should allow the computation of reliable confidence and prediction intervals for quantities of interest, as well as other statistical tools like likelihood ratio statistics to compare nested models. Such a methodology is currently unavailable.
Our research begins to fill this gap by advancing the YAPS model to additionally predict behavioural states. Because positions are sampled in continuous time, we employ a continuous-time Markov chain to model the behavioural state evolution. For model fitting, we follow the iterative framework outlined in Whoriskey (2021), which takes advantage of maximum likelihood theory in both the HMM and SSM paradigms to efficiently and accurately fit SHMs to animal movement data. We relax the snapshot assumption such that we can adapt standard HMM computational tactics and use simulations to evaluate the accuracy of our implementation. Model efficacy is demonstrated by fitting both two- and three-state models to approximately six days' worth of acoustic detections collected on a female carnivorous fish, the northern pike Esox lucius, throughout Hald Lake, Denmark.
2 MATERIALS AND METHODS
In accordance with Baktoft et al. (2017), we name our behavioural YAPS methodology ‘Yet Another hidden Markov model Solver’, or YAMS for short. YAMS has three goals: (a) predict a spatially- and temporally-continuous path from a series of spatially-discrete animal detections measured with error; (b) predict the evolution of behavioural states; and (c) accurately estimate the parameters governing the animal movement and detection processes.
2.1 Study system
Our study system is Hald Lake in Denmark (Figure 1), which covers an area of approximately 3.4 km² with a maximum depth of 31 m (Jeppesen et al., 1999). From April 2019 to February 2020, 70 Thelma TBR 700 receivers were in place over the entire range of Hald Lake (Figure 1). Three species were tagged as part of a broader ecological study: brown trout Salmo trutta, European eel Anguilla anguilla and northern pike. The study was carried out in accordance with the permission 2012-DY-2934-00007 from the Danish Experimental Animal Committee. We limit our analysis to detections from a single adult (length = 93.2 cm) female pike that was tagged with a Thelma D-HP9 tag (30.5 mm long × 9 mm diameter, ~9 month duration) that transmitted at 71 kHz. Acoustic transmissions were programmed to occur randomly within a fixed interval of 10–30 s to reduce collisions. Our data include these random times, but other datasets do not.
2.2 Data notation
Detection data are structurally complex. They require a combination of three data types: tag and receiver metadata, and logs of detections at each receiver, such that a single observation consists of a time-stamp associated with a receiver location and a tag ID (Whoriskey et al., 2019). Detections are spatially-discrete, and can be biased based on predefined receiver locations, for example, if receivers are placed only in favourable habitats. The probability of detection varies based on many factors including time, environmental condition and distance between the receiver and tag. Thus, true animal absence cannot be measured, and the meaning of presence is dynamic because it is recorded within a changing detection range. When detection ranges overlap, a single data unit (transmission) yields multiple observations (detections).
With these idiosyncrasies in mind, we introduce the following notation. The index i will be used for a variable ordered in time. It ranges from 1 to N, where N denotes the total number of data units (transmissions), but not the total number of observations (detections). When indexed with a colon, for example, 1:N, this entails all values including and between 1 and N. The index c denotes the coordinate axes, which in our case will be Eastings or Northings. Two indices will be used for behavioural states: j, and k, and these each range from 1 to m total state values (here m will equal either 2 or 3). In a minor abuse of notation but for concision and ease of interpretability, r will be used to both denote the location of a receiver (in which case it will be subscripted by c), and to index a variable at a receiver (in which case r will be the subscript); a misconduct that is only mildly offensive because locations are unique among receivers. Finally, bold characters denote vectors and matrices.
2.3 Model definition
Term | Definition |
---|---|
i | Index of observations ordered in time |
$t_i$ | Irregularly observed time of origin of a transmission |
m | Number of behavioural states |
j, k | Indices for behavioural states |
c | Coordinate axis |
$r_c$ | Receiver location indexed by coordinate axis; because locations are unique among receivers, r is used synonymously with receiver ID |
 | Behavioural state at time $t_i$ |
 | Location of animal at time $t_i$ in coordinate axis c |
$\tau_{r,i}$ | Observed time that a transmission arrives at a receiver |
 | Predicted time that a transmission arrives at a receiver |
 | Error between the observed and predicted time of arrival of an acoustic transmission at a receiver |
 | A t-distribution with three degrees of freedom |
 | Estimated scale parameter of the arrival-time error distribution |
 | The distance between receiver r and the unobserved true location of the animal at time $t_i$. This is computed for all receivers and all transmissions |
 | Temporal interval between consecutive transmissions |
 | Tag internal clock drift at time $t_i$, accounting for variability between consecutive transmission times |
 | Variance for the random walk modelling the tag drift |
υ | Speed of sound. We keep this constant at 1,465 m/s |
 | Diffusion parameter of the animal movement process |
 | Generator matrix of the continuous-time Markov process |
 | Continuous-time analogue to the transition probability matrix |
2.3.1 Measurement
We observe $\tau_{r,i}$, the time that a transmission arrives at receiver r. This is distinguished from the time that the transmission originated at the tag, which we denote $t_i$. We calculate the distance between the receiver location $r_c$ and the unknown tag location, where c denotes the appropriate coordinate axis. Then, the expected travel time of a transmission is calculated from this distance and the speed of sound υ, and this is added to $t_i$ to get the expected time that a transmission arrives at a receiver. In some implementations of YAPS, υ is modelled as a Wiener process or included as data; here for simplicity, we assume that it is constant. The stochasticity of the transmission time depends on whether the random transmission interval is known or not. When known, as is the case in our implementation, $t_i$ is a sum of the previous transmission time $t_{i-1}$, the known transmission interval, and an internal clock drift that is modelled with a random walk on its first differences. Finally, errors in the expected versus observed times of arrival can originate from multiple sources including, for example, varying aquatic conditions affecting the speed of sound or physical obstructions causing the transmissions to ‘bounce’ before reaching a receiver. This measurement error is accounted for by modelling the difference between the observed and expected transmission arrival times as a scaled t-distribution with three degrees of freedom and an estimated scale parameter, as in Baktoft et al. (2017).
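The measurement step can be illustrated with a minimal Python sketch. It is not the YAPS or YAMS implementation: the function names, the example coordinates, the made-up observed arrival time and the scale value are all assumptions introduced for illustration. Only the constant speed of sound (1,465 m/s) and the use of a scaled t-distribution with three degrees of freedom come from the text.

```python
import math

SPEED_OF_SOUND = 1465.0  # m/s, the constant value stated in the text


def expected_arrival_time(t_origin, tag_xy, receiver_xy, speed=SPEED_OF_SOUND):
    """Transmission time plus straight-line travel time to the receiver."""
    distance = math.dist(tag_xy, receiver_xy)  # Euclidean distance in metres
    return t_origin + distance / speed


def scaled_t3_logpdf(residual, scale):
    """Log-density of a t-distribution with 3 degrees of freedom and a scale parameter."""
    df = 3.0
    z = residual / scale
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi) - math.log(scale))
    return log_norm - (df + 1) / 2 * math.log1p(z * z / df)


# Hypothetical example: a tag 1,465 m east of a receiver, transmitting at t = 10 s.
predicted = expected_arrival_time(10.0, (1465.0, 0.0), (0.0, 0.0))
observed = 11.002  # made-up observed arrival time (s)
log_density = scaled_t3_logpdf(observed - predicted, scale=0.001)
```

Summing such log-densities over all receivers that registered a transmission would give one possible form of the measurement contribution to a likelihood, under the assumptions stated above.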
2.3.2 Movement
The locations are assumed to follow a Wiener process (Equation 2) governed by a diffusivity parameter; this process emulates discrete-time random walks in continuous time.
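As a rough illustration of a Wiener process observed at irregular times, the following Python sketch simulates a two-dimensional track. The increment variance of 2 × diffusivity × interval per coordinate is an assumption made here for illustration, as are the function name, the starting point and the diffusivity value; the 10–30 s intervals follow the tag programming described in Section 2.1.

```python
import math
import random


def simulate_wiener_track(start_xy, intervals, diffusivity, rng):
    """Simulate a 2-D Wiener process observed at irregular intervals.

    ASSUMPTION: each coordinate's increment is Normal(0, 2 * D * dt).
    """
    x, y = start_xy
    track = [(x, y)]
    for dt in intervals:
        sd = math.sqrt(2.0 * diffusivity * dt)  # spread grows with elapsed time
        x += rng.gauss(0.0, sd)
        y += rng.gauss(0.0, sd)
        track.append((x, y))
    return track


rng = random.Random(1)
intervals = [rng.uniform(10.0, 30.0) for _ in range(9)]  # seconds between transmissions
track = simulate_wiener_track((0.0, 0.0), intervals, diffusivity=1.0, rng=rng)
```

A larger diffusivity produces larger expected displacements over the same interval, which is why state-specific diffusivities can be read as slower or faster movement.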
2.3.3 Behaviour
Maximizing the likelihood of a hierarchical model similar to Equations 1–3 is difficult because of the large numbers of both continuous and discrete random effects (Altman, 2007; McKellar et al., 2015). Rather than attempting to maximize the full likelihood directly, we employ the procedure outlined in Whoriskey (2021) to estimate parameters and predict random effects via iterative optimization of an HMM likelihood and an SSM likelihood, which we describe in turn.
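The behavioural process is a continuous-time Markov chain defined by a generator matrix, with a continuous-time analogue to the transition probability matrix. As a hedged illustration of how a generator yields transition probabilities over an irregular interval, the sketch below uses the standard closed form of the matrix exponential for a two-state chain. The function name and the rate values are hypothetical, and the sketch does not reproduce the paper's equations.

```python
import math


def two_state_transition_matrix(rate_12, rate_21, dt):
    """P(dt) = exp(Q * dt) for Q = [[-rate_12, rate_12], [rate_21, -rate_21]]."""
    total = rate_12 + rate_21
    decay = math.exp(-total * dt)
    p12 = rate_12 / total * (1.0 - decay)  # probability of being in state 2 after dt, from state 1
    p21 = rate_21 / total * (1.0 - decay)  # probability of being in state 1 after dt, from state 2
    return [[1.0 - p12, p12], [p21, 1.0 - p21]]


# Hypothetical switching rates (per second) evaluated at the shortest and
# longest transmission intervals used in the study (10 and 30 s).
short_gap = two_state_transition_matrix(0.01, 0.02, dt=10.0)
long_gap = two_state_transition_matrix(0.01, 0.02, dt=30.0)
```

Longer gaps between transmissions give a higher probability of a switch, which is how irregular sampling enters the behavioural process. For more than two states, a numerical matrix exponential would be needed in place of this closed form.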
2.4 State-space model likelihood
2.5 Hidden Markov Model likelihood
2.6 Model fitting
In practice, we carry out the iteration as follows. To optimize the SSM step, we require a known sequence of behavioural states. To optimize the HMM step, reasonable location values are necessary. As a result, to initialize the optimization we must either treat a randomly generated sequence of behavioural states as known, or obtain initial values of the locations in continuous space. We choose the latter option, and achieve this by fitting a one-behaviour SSM (YAPS) to the observed data. An HMM according to Equation 11 (Section 2.5) is then fitted to these initial values, from which we obtain behavioural state predictions and parameter estimates. The behavioural state predictions are then treated as known in the SSM step (Equation 6). In Whoriskey (2021) the movement parameters were also fixed during the SSM step. However, in this implementation we achieved better performance by treating the movement parameters as unknown during both the HMM and SSM steps. We run the iteration for a fixed number of steps and, following the implementation of Whoriskey (2021), assume that the parameter estimates from the iteration with the maximum SSM likelihood (the likelihood of the SSM step evaluated at the maximum likelihood estimates of the parameters) represent the global maximum because this theoretically corresponds to the parameter set that is most likely given the observed data. In practice, all calculations are performed on the negative log scale.
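The control flow of the iteration described above can be sketched in Python as follows. This is only a skeleton under stated assumptions: `fit_hmm` and `fit_ssm` are hypothetical callables standing in for the actual HMM and SSM optimizations, the passing of HMM parameter estimates to the SSM step (for example as starting values) is an assumption, and the toy stand-ins at the end exist only so that the sketch runs.

```python
def iterate_ssm_hmm(initial_locations, fit_hmm, fit_ssm, n_steps=10):
    """Alternate HMM and SSM fits; keep the step with the lowest SSM negative log-likelihood.

    fit_hmm(locations) -> (states, params)
    fit_ssm(states, params) -> (locations, params, nll)
    Both callables are hypothetical placeholders supplied by the user.
    """
    locations = initial_locations
    best = None
    for step in range(n_steps):
        states, params = fit_hmm(locations)               # behavioural states given locations
        locations, params, nll = fit_ssm(states, params)  # locations given states treated as known
        if best is None or nll < best["nll"]:
            best = {"step": step, "nll": nll, "params": params,
                    "states": states, "locations": locations}
    return best


# Toy stand-ins so the sketch runs; they are not real model fits.
nll_sequence = iter([120.0, 95.0, 101.0])


def toy_fit_hmm(locations):
    return [0] * len(locations), {"diffusivity": 1.0}


def toy_fit_ssm(states, params):
    return [0.0] * len(states), params, next(nll_sequence)


best = iterate_ssm_hmm([0.0, 1.0, 2.0], toy_fit_hmm, toy_fit_ssm, n_steps=3)
```

Selecting the step with the lowest negative log-likelihood is equivalent to selecting the maximum SSM likelihood, which matches working on the negative log scale.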
We additionally note that two other optimizations must occur in order to predict the random effects. For the location and drift states, the optimization occurs iteratively within the likelihood calculations via TMB; for the behavioural states, it is implemented via the Viterbi algorithm ex post. Equations 11 and 6 show clearly that we are iterating between the maximization of two conditional likelihoods, which is a frequentist analogue to many Bayesian implementations of Markov Chain Monte Carlo simulations, where proposed samples are iteratively obtained from many conditional distributions (e.g. Parton & Blackwell, 2017).
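For readers unfamiliar with the Viterbi algorithm mentioned above, a generic log-space version is sketched below in Python. It is a textbook formulation, not the authors' code. Allowing a separate transition matrix per interval is an assumption made here to reflect irregular sampling, and all inputs in the example are made up.

```python
import math


def viterbi(log_initial, log_transitions, log_emissions):
    """Most likely state sequence of an HMM, computed in log space.

    log_initial[j]           : log probability of starting in state j
    log_transitions[i][j][k] : log probability of moving from state j at step i
                               to state k at step i + 1 (one matrix per interval)
    log_emissions[i][j]      : log density of observation i given state j
    """
    n, m = len(log_emissions), len(log_initial)
    score = [log_initial[j] + log_emissions[0][j] for j in range(m)]
    backpointers = []
    for i in range(1, n):
        trans = log_transitions[i - 1]
        new_score, pointers = [], []
        for k in range(m):
            best_j = max(range(m), key=lambda j: score[j] + trans[j][k])
            pointers.append(best_j)
            new_score.append(score[best_j] + trans[best_j][k] + log_emissions[i][k])
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final state.
    state = max(range(m), key=lambda j: score[j])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    return path[::-1]


# Made-up two-state example with "sticky" transitions.
sticky = [[math.log(0.9), math.log(0.1)], [math.log(0.1), math.log(0.9)]]
favours_0 = [math.log(0.9), math.log(0.1)]
favours_1 = [math.log(0.1), math.log(0.9)]
decoded = viterbi([math.log(0.5)] * 2, [sticky] * 3,
                  [favours_0, favours_0, favours_1, favours_1])
```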
2.7 Analysis
We fitted YAMS to approximately six days' worth of data collected on an adult female pike tracked in Hald Lake in summer 2019. Because many detections occur for a single transmission, and because the transmission interval is often small compared to the temporal scale of the study (in our case, 10–30 s), a relatively short study duration can yield a large dataset. In our analysis, approximately 6 days of monitoring a single pike resulted in 144,625 detections from 25,000 transmissions. Given that the number of random effects in our model grows with the number of transmissions, analysing a dataset of this magnitude is difficult. We therefore broke the dataset into groups of 5,000 transmissions, and fitted YAMS to each group, with both two and three behavioural states. We chose 5,000 transmissions based on previous experience and success when fitting YAPS: too few data would result in many groups for a single individual, but too many data increase the number of random effects and can make optimization more difficult. For each group, we ran the model for 10 steps.
To carry out a proper simulation study (Section 2.8), we required an estimate of the detection efficiency to mimic whether a simulated tag transmission was successfully registered by a receiver. The terms detection efficiency and detection range are often used interchangeably. Range and efficiency both describe the relationship between the probability of detection at a receiver and distance to the tag; we refer to a receiver’s range when distance is the variable of interest, and its efficiency when the probability of detection is of interest. We calculated the efficiency based on model results as follows: first, we computed all distances between each location of the predicted track and all receivers. We then binned these into groups based on 5 m intervals of distance, and calculated the proportion of receivers that registered a tag transmission. To quantify the detection efficiency, we split the data into 70% training and 30% testing datasets and fitted a binomial generalized additive model using the R package mgcv (Wood, 2017) to the proportion of successfully registered transmissions as a function of the smoothed distance between the tags and receivers.
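The binning step described above can be sketched in Python as follows. The function name and the toy inputs are hypothetical, and the sketch covers only the 5 m binning and proportion calculation, not the generalized additive model fitted with mgcv.

```python
def binned_detection_proportions(distances, detected, bin_width=5.0):
    """Proportion of receiver-transmission pairs detected within each distance bin.

    Returns a dict mapping the lower edge of each bin (in metres) to the proportion.
    """
    counts = {}
    for distance, hit in zip(distances, detected):
        bin_start = int(distance // bin_width) * bin_width
        total, hits = counts.get(bin_start, (0, 0))
        counts[bin_start] = (total + 1, hits + int(hit))
    return {start: hits / total for start, (total, hits) in sorted(counts.items())}


# Toy input: distances (m) from predicted locations to receivers, and whether
# each receiver registered the transmission.
proportions = binned_detection_proportions(
    [1.0, 4.0, 6.0, 12.0, 13.5], [True, False, True, False, False])
```

The resulting proportions, paired with their bin distances, are the kind of input to which a smooth binomial model could then be fitted.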
2.8 Simulation and the snapshot principle
When fitting continuous-time HMMs, it is necessary to consider whether the snapshot principle holds, which is the assumption that the observed movement of an animal is only dependent on the active behavioural state (Patterson et al., 2017). In our implementation, we relax the assumption of the snapshot principle in order to utilize computationally efficient machinery for approximating the likelihood and predicting the behavioural states. To assess the validity of our approximation, we designed the following simulation study.
We simulated 60 tracks using the estimated parameters of the first group of 5,000 transmissions. First, we simulated 4,999 transmission intervals from a Uniform distribution with limits of 10 and 30 s, and 5,000 tag drift times. Then, we simulated the embedded Markov chain, that is, the discrete chains of holding times within states and jumps to the next state, which we used to create the sequence of 5,000 behavioural states. To test our approximation, we combined the exact behavioural switching times with our observation times (and the corresponding states at switching with our behavioural state sequence), and simulated the animal locations based on these augmented sequences. Once the full path of 5,000+ locations was simulated, we removed the locations and behavioural states corresponding to the exact switching times in order to simulate the effect of the unobserved and unaccounted for behavioural switches. From here, we randomly placed the track within the lake and calculated the distances from every location to each receiver. To simulate detections, we used the detection efficiency model results to predict the probability that each receiver would have detected the simulated track based on these distances, then we simulated a detection at each receiver with a Bernoulli trial. Finally, we fitted YAMS to each simulated dataset, and quantified the root mean squared error (RMSE) of the parameter estimates, the RMSE of the locations and the behavioural state accuracy as the proportion of behavioural states correctly identified (see Whoriskey, 2021).
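The simulation of the embedded Markov chain (holding times followed by jumps) can be sketched in Python as below. This is an illustrative sketch, not the authors' simulation code: the generator values, function names and number of observations are hypothetical, and every state is assumed to have a positive rate of leaving.

```python
import random


def simulate_ctmc(generator, initial_state, end_time, rng):
    """Simulate switching times and states of a continuous-time Markov chain.

    generator[j][k] (j != k) is the rate of switching from state j to k;
    generator[j][j] is minus the total rate of leaving state j (assumed non-zero).
    Returns a list of (time, state) pairs starting at time 0.
    """
    time, state = 0.0, initial_state
    path = [(time, state)]
    while True:
        leave_rate = -generator[state][state]
        time += rng.expovariate(leave_rate)  # holding time in the current state
        if time >= end_time:
            return path
        others = [k for k in range(len(generator)) if k != state]
        weights = [generator[state][k] for k in others]
        state = rng.choices(others, weights=weights)[0]  # jump of the embedded chain
        path.append((time, state))


def state_at(path, t):
    """Active state at time t, given the (time, state) switching path."""
    current = path[0][1]
    for switch_time, state in path:
        if switch_time > t:
            break
        current = state
    return current


rng = random.Random(42)
generator = [[-0.02, 0.015, 0.005],   # hypothetical rates (per second)
             [0.01, -0.03, 0.02],
             [0.005, 0.015, -0.02]]
obs_times = [0.0]
for _ in range(49):
    obs_times.append(obs_times[-1] + rng.uniform(10.0, 30.0))  # 10-30 s intervals
path = simulate_ctmc(generator, 0, obs_times[-1], rng)
observed_states = [state_at(path, t) for t in obs_times]
```

Merging the switching times in `path` with `obs_times` would give the augmented sequence on which locations are simulated before the switching-time records are removed.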
3 RESULTS
3.1 Pike dataset
We fitted both two- and three-state models to the pike dataset (Table 2). According to pseudoresidual QQ plots, both models appeared to fit well (Figure A.1, Supporting Information). Table 2 displays the estimated movement parameters for each model for the full pike dataset. Increasing state numbers correspond to increasing levels of dispersion, such that larger state numbers can be interpreted as ‘faster’ movement relative to smaller state numbers. We classify the estimates as slow, medium speed or fast movement. These divisions were selected arbitrarily, and are used to provide an intuitive means for comparing the estimates across groups. The two-state model identified a slow state for all five groups. Fast movement was identified for the first group, while the other four groups included a medium speed state. For this model, the slow state was observed along most of the track (56%–78% of the time), as determined by the activity budgets and the mean durations spent within a state (Table 2). The three-state model identified at least one slow state in every group. Four of the five groups identified a second slow state, coupled with either a medium speed state (groups 3 and 4) or a fast state (groups 1 and 5). Group 2 identified both a medium speed state and a fast state. This model generally produced results consistent with the two-state model; for example, it also suggested that the pike spent most of its time in a slow state (noting that multiple slow states were observed for most groups; Table 2).
State | Group 1 | Group 2 | Group 3 | Group 4 | Group 5 |
---|---|---|---|---|---|
Estimated movement parameter | | | | | |
Two-State Model | | | | | |
State 1 | 0.97 | 2.03 | 0.36 | 1.38 | 1.05 |
State 2 | 20.76 | 18.51 | 11.45 | 13.63 | 14.02 |
Three-State Model | | | | | |
State 1 | 0.19 | 0.74 | 0.16 | 0.23 | 0.25 |
State 2 | 3.69 | 11.88 | 1.42 | 1.39 | 3.59 |
State 3 | 24.75 | 23.70 | 13.32 | 17.03 | 27.78 |
Activity budget (proportion of time in state) | | | | | |
Two-State Model | | | | | |
State 1 | 0.56 | 0.65 | 0.59 | 0.74 | 0.78 |
State 2 | 0.44 | 0.35 | 0.41 | 0.26 | 0.22 |
Three-State Model | | | | | |
State 1 | 0.27 | 0.59 | 0.38 | 0.38 | 0.63 |
State 2 | 0.41 | 0.20 | 0.29 | 0.38 | 0.26 |
State 3 | 0.32 | 0.21 | 0.33 | 0.24 | 0.11 |
Mean duration in state | | | | | |
Two-State Model | | | | | |
State 1 | 11.51 | 12.61 | 13.83 | 22.00 | 15.47 |
State 2 | 9.15 | 6.78 | 9.67 | 7.85 | 4.47 |
Three-State Model | | | | | |
State 1 | 4.91 | 8.39 | 8.28 | 13.85 | 15.20 |
State 2 | 5.78 | 2.78 | 3.86 | 9.28 | 4.69 |
State 3 | 11.68 | 60.95 | 8.95 | 8.27 | 8.20 |
Figure 2 shows the pike path coloured by behavioural state; the track was located in the southwest portion of the lake. The pike appeared to swim faster on the outer perimeter of the lake, and slower towards the interior.

The increasing values of the estimated movement parameters correspond well with the observed speeds of the animal (Figures 3 and 4). The distribution of speeds changed among data groups, which can be seen from the variability in the observed ranges of speeds in Figures 3 and 4. This also corresponds with the varying estimates in Table 2. Greater segregation among states occurred in the two-state model compared to the three-state model (Figure 3; Table 2).


3.2 Simulation study
We used the GAM depicted in Figure 5 to simulate detections of an animal throughout the lake given an underlying movement path. These results showed that detection efficiency dropped nonlinearly with distance, with predicted detection probabilities of 0.74, 0.42 and 0.08 at distances of 100, 250 and 500 m from the receiver. Although other models were considered (e.g. mixed effects models to account for within-receiver variability), our model that considered the variability in detection efficiency to be constant throughout space was determined to fit the best based on cross-validation with 70% training and 30% testing datasets.
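The detection simulation can be illustrated with a small Python sketch. The fitted GAM is not available here, so the sketch substitutes a piecewise-linear curve through the three detection probabilities reported above; that interpolation, the clamping outside 100–500 m, and the function names are assumptions made purely for illustration.

```python
import random

# Detection probabilities reported in the text at 100, 250 and 500 m.
REPORTED_POINTS = [(100.0, 0.74), (250.0, 0.42), (500.0, 0.08)]


def detection_probability(distance, points=REPORTED_POINTS):
    """Piecewise-linear stand-in for the fitted GAM curve (an assumption), clamped at both ends."""
    if distance <= points[0][0]:
        return points[0][1]
    for (d0, p0), (d1, p1) in zip(points, points[1:]):
        if distance <= d1:
            return p0 + (p1 - p0) * (distance - d0) / (d1 - d0)
    return points[-1][1]


def simulate_detections(distances, rng):
    """One Bernoulli trial per receiver distance: True if the transmission is registered."""
    return [rng.random() < detection_probability(d) for d in distances]


rng = random.Random(7)
receiver_distances = [60.0, 175.0, 320.0, 640.0]  # hypothetical distances (m)
detections = simulate_detections(receiver_distances, rng)
```

In a full simulation the probabilities would come from predictions of the fitted detection efficiency model rather than from this interpolation.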

We fitted YAMS to 60 simulated tracks under both a two-state and three-state scenario. Within each simulation study, 22 of the models either falsely converged, or did not converge. These results were removed, leaving 38 model results. The two-state simulation study yielded a mean behavioural state accuracy of 0.94 (median = 0.95, Figure 6), and an average location RMSE of 1.8 m (median = 1.6 m). The three-state simulation study showed lower levels of accuracy in the behavioural state prediction, with an average proportion of 0.79 (median = 0.81) of the behavioural states being correctly classified. However, the three-state model achieved similar precision in the location state predictions, with an average RMSE of 1.8 m (median = 1.5 m, Figure 6). Parameter results were also precise, as documented by Tables B.1 and B.2 (see supplementary material).
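The two summary metrics reported above can be computed as in the following Python sketch. The function names and toy values are hypothetical, and defining the location RMSE on Euclidean distances (rather than per coordinate axis) is an assumption, since the exact formula is not given in this section.

```python
import math


def location_rmse(true_xy, predicted_xy):
    """Root mean squared Euclidean distance between true and predicted locations."""
    squared = [math.dist(t, p) ** 2 for t, p in zip(true_xy, predicted_xy)]
    return math.sqrt(sum(squared) / len(squared))


def state_accuracy(true_states, predicted_states):
    """Proportion of behavioural states correctly identified."""
    matches = sum(t == p for t, p in zip(true_states, predicted_states))
    return matches / len(true_states)


# Toy values for illustration only.
rmse = location_rmse([(0.0, 0.0), (0.0, 0.0)], [(3.0, 4.0), (0.0, 0.0)])
accuracy = state_accuracy([1, 1, 2, 2], [1, 2, 2, 2])
```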

4 DISCUSSION
This research combines the existing SSM of YAPS with a latent Markov chain and the iterative model fitting framework of Whoriskey (2021) to develop a novel SHM designed specifically for acoustic detections. With this formulation, a researcher need not depend on positioning algorithms from manufacturers, which can be expensive and contain larger amounts of error, and they can conveniently utilize the same model likelihood that predicts the location states to predict behavioural states. We demonstrated the capabilities of this model to identify multiple behavioural states of a predatory fish, and tested its accuracy using simulation studies. Several decisions were made during model fitting that impacted the results.
To maintain consistency with YAPS, we chose a relatively simple process to model animal movement with a single parameter that governs the dispersion of an animal in any direction. More complex processes could yield more details on how movement evolves through time. For example, using an Ornstein–Uhlenbeck process on the velocities, rather than a Wiener process on the locations, could allow for estimation of drift or home range tendencies and autocorrelation within the track (Johnson et al., 2008; Pedersen & Weng, 2013). Alternatively, modelling the step lengths and turning angles as separate continuous-time processes could provide more easily accessible interpretations of movement (Parton & Blackwell, 2017). In our pike analysis, we found that the Wiener process was sufficient to identify behavioural states that corresponded well with differences in the observed speed of the animal (Figure 4). Future research might benefit from investigating the utility of more complex movement models; however, other researchers have rationalized the use of simpler models in order to provide quick and accurate results (Jonsen et al., 2020).
When fine-scale positioning of acoustically tagged animals is a primary research goal, shorter transmission intervals and extensive receiver coverage over the entire study area will often result in extremely large datasets. In our case, approximately 6 days' worth of observation on a single pike yielded 25,000 transmissions. Because computational limitations prevented us from fitting a single model to the full dataset, we split the dataset into five groups of 5,000 transmissions each. Future studies should consider that different groupings (e.g. 10,000 instead of 5,000) could yield varying results and therefore provide different interpretations of the behavioural states. In addition, for a classic telemetry study, which often comprises >10 animals for months at a time, our implementation could take days to weeks to complete depending on available computing resources. We suspect these limitations will lessen with increasing development of computer hardware and software, thereby enhancing the analysis of increasingly large datasets.
YAMS requires an ancillary stochastic step that we did not describe because it is unaltered from the original YAPS implementation. Both acoustic receivers and tags contain time keeping mechanisms that experience drift. Small values of drift, even milliseconds, can lead to error in position estimates on the order of metres because transmissions travel at the speed of sound (at 1,465 m/s, a timing error of 1 ms corresponds to roughly 1.5 m of travel). Although we can account for drift in the tag clocks within YAPS/YAMS, accounting for drift in the receiver clocks must also be achieved, and we did this prior to fitting the movement models by using an SSM (via YAPS) to synchronize the receiver clocks (Baktoft et al., 2019). Receiver synchronization, whether it is achieved by an SSM or other tactic, is typically required regardless of the method used to generate positional estimates (e.g. Smith, 2013). The reader is referred to Baktoft et al. (2019) for a guide on synchronizing the receiver clocks.
In our continuous-time HMM formulation we allow the behavioural process to switch between sampling times. However, in our movement process, we specify that an animal’s location at a given sampling time depends on the behavioural state at a single sampling time only. For those occasions where the behavioural state switches between sampling times, the movement-behaviour dependence is not precise because the movement of the animal depends on the behaviours at both the current and the previous sampling times, $t_i$ and $t_{i-1}$ (snapshot principle; Patterson et al., 2017). To precisely model movement when the switches occur, the times of switches would have to be predicted. This has been done in Bayesian formulations (Michelot & Blackwell, 2019; Parton & Blackwell, 2017). Instead, here we relaxed the snapshot assumption such that we could use HMM and SSM likelihood machinery available through TMB to approximate our likelihood around the times of switches, and take advantage of this platform’s relative speed compared to sampling techniques (Auger-Méthé et al., 2016; Whoriskey et al., 2017). We specifically designed our simulation studies to measure the error in our model fitting procedure incurred in part by this approximation. The results showed high levels of accuracy for the behavioural and location states, suggesting that our model is accurate despite relaxing the snapshot assumption. This is likely because the temporal resolution of our data (observations occurring randomly every 10–30 s) is fine relative to the scale of the behavioural states that we are predicting. Longer transmission intervals might incur larger amounts of error; therefore, future researchers who use YAMS should carefully consider the temporal scales of their inferred behaviours relative to their observations and re-evaluate the snapshot principle with simulation studies when necessary.
The simulation studies showed that both the two- and three-state models attained a high level of accuracy. We did have to omit approximately one third of the simulated tracks because of false convergence or a lack of convergence. This was surprising because we encountered these problems very infrequently when fitting the models to the real data (we had false convergence for 1 of 10 steps when fitting the two-state model to the first group, and for 2 of 10 steps when fitting the three-state model to the fourth group), but it may be explained by the fact that simulations sometimes cannot fully capture the variability inherent to real-life scenarios. When it is not possible to eliminate problematic tracks (e.g. when analysing real data), researchers may successfully fit models with TMB if they change the starting values, either by adding a small amount of random noise or by selecting a new set entirely. The 15 percentage point decrease in behavioural state accuracy in the three-state model relative to the two-state model could be explained by the fact that the three-state model had decreased separation in the estimated values of the movement parameters that drive the state classification. The decrease might also be explained by the fact that the three-state model involved considerably more switches (136–209) among behavioural states than the two-state model (92–136). Although it seems reasonable that more states and switches could result in larger amounts of error, it is also possible that this error resulted from our approximation of the likelihood around the times of switches. However, we are unable to answer this question because our simulation studies could not separate error inherent to the behavioural state prediction from error specifically incurred by relaxing the snapshot assumption.
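The strategy of restarting from perturbed starting values, described above, can be sketched generically as follows. This is a minimal illustration in Python rather than the R/TMB code used for YAMS: `fit` is a hypothetical callable that returns a convergence flag and the estimates, and `toy_fit` is a stand-in that mimics an optimizer stalling at one degenerate starting point.

```python
import random


def refit_with_jitter(fit, start, noise_sd=0.1, max_tries=10, seed=1):
    """Try `fit(start)`; on non-convergence, retry from jittered starts.

    `fit` is a hypothetical callable returning (converged, estimates).
    Returns (estimates, number_of_attempts) or raises RuntimeError.
    """
    rng = random.Random(seed)  # fixed seed so that reruns are reproducible
    current = list(start)
    for attempt in range(1, max_tries + 1):
        converged, estimates = fit(current)
        if converged:
            return estimates, attempt
        # Perturb the original starting values with small Gaussian noise.
        current = [s + rng.gauss(0.0, noise_sd) for s in start]
    raise RuntimeError(f"no convergence after {max_tries} attempts")


def toy_fit(start):
    """Stand-in for a model fit: fails only when started exactly at 1.0."""
    converged = abs(start[0] - 1.0) > 1e-9
    return converged, ([2.0, 0.5] if converged else None)


if __name__ == "__main__":
    estimates, attempts = refit_with_jitter(toy_fit, [1.0, 0.0])
    print(f"converged on attempt {attempts} with estimates {estimates}")
```

Selecting an entirely new set of starting values, the second option mentioned above, would amount to replacing the jitter line with a draw from a wider distribution. In either case, the convergence diagnostics of the final fit should still be checked.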
We now offer interpretations of the two-state rather than the three-state model with respect to pike ecology. We base this decision on (a) the 15 percentage point higher average behavioural state accuracy of the two-state model, (b) the greater separation of the two-state movement parameters compared to those of the three-state model and (c) the dynamic nature of the behavioural states among groups. Importantly, we remind readers that the states we have predicted are mathematically distinct, but can only be interpreted in light of previous knowledge of the study animal and system.
Pike are commonly referred to as sit-and-wait predators (Eklöv, 1997). Laboratory experiments have shown that pike often remain stationary, watching and then ambushing prey when they come within range (Harper & Blake, 1991; Savino & Stein, 1989). They additionally suggest that pike will often track their prey slowly with an elongated posture before attacking (Harper & Blake, 1991). During prey capture attempts, acceleration can reach up to 96 m/s² (Harper & Blake, 1991). During escape, acceleration can be even higher (120 m/s²; Harper & Blake, 1991); thus, both behaviours are energetically costly (Frith & Blake, 1995). In this study, we were able to classify slow, medium and fast rates of movement of our single northern pike. It is unlikely that our fast state identified acute hunting or predator avoidance for three reasons. First, these events are nearly instantaneous, and our observed speeds did not reach those that have been previously documented for these behaviours. Here, our maximum observed speed was m/s, compared to maximum speeds of and m/s for predatory and escape behaviours respectively (Harper & Blake, 1991). Second, the fast behaviour had a large variability in observed speed, including values that might suggest stationary movement (Figure 3). Third, the fast behaviour was persistent, that is, the pike was observed to remain within this behaviour for an extended period of time (≈5–10 min; Table 2); therefore, it is unlikely that this animal was consistently hunting or avoiding predators during these time periods given the energetic cost (Frith & Blake, 1995). It is more likely that the fast behaviour is documenting exploratory travel throughout the lake, which could include either of the above rapid response behaviours.
We offer two explanations for the slow behaviour. This state did not identify stationary behaviour by itself, as we documented speeds within this state >0 m/s, and the pike still covered distance while within this state (Figure 2). However, it is possible that this state identifies a composite of stationary behaviour and slow tracking of prey items before prey capture attempts. Alternatively, it is possible that this pike adopts a slow, steady speed at regular intervals and for a majority of its time (Table 2) to conserve energy that may be needed for rapid acceleration at later, opportunistic times. This has been proposed to explain an observed high proportion of low activity in another ambush predator, the great barracuda Sphyraena barracuda (O’Toole et al., 2010). Fine-scale accelerometry data could help to distinguish between these two possible behaviours.
Our implementation works well when receivers are closely spaced relative to their detection efficiency, and when the study animals remain within the receiver array. We developed YAMS under the assumption that the random transmission intervals are known and that the speed of sound is constant; however, it should be readily extensible to cases with unknown intervals or a stochastically varying speed of sound, as these implementations are already available in YAPS. Furthermore, other HMM functionality could be incorporated, for example the inclusion of covariates to assess the influence of environmental variables on behavioural state switching (Michelot et al., 2016). We hope that YAMS will help unlock the utility of HMMs for acoustic telemetrists, such that they can gain insight into the underlying drivers of movement and further our understanding of, for example, foraging ecology (Nowak et al., 2020), differences among colonies (Dean et al., 2013; Whoriskey, 2021), interactions of predators with fisheries (van Beest et al., 2019) and the effects of potentially harmful disturbances (deRuiter et al., 2017).
ACKNOWLEDGEMENTS
This work would not have been possible without the support of several parties. J.M.F. was funded by a Discovery grant from the Natural Sciences and Engineering Research Council of Canada (NSERC) as well as a Canadian Statistical Sciences Institute (CANSSI) Collaborative Research Team Project. K.W. was supported through an NSERC Canada Graduate Scholarship as well as a Killam Doctoral Scholarship. H.B. was funded by the Danish rod and net fish licence funds. E.L. was supported by a Vanier Canada Postgraduate Scholarship. This research was also supported by the Ocean Tracking Network, and was a direct result of the internationally collaborative ideasOTN committee and the Telemetry Workshop Series that took place in February 2020. Finally, we thank two anonymous reviewers for their constructive comments and well-placed advice.
CONFLICT OF INTEREST
The authors have no conflict of interest.
AUTHORS' CONTRIBUTIONS
K.W. conceived the idea for this research and developed the method closely with H.B.; K.W. led the analysis and writing. Throughout, guidance on the method development and analysis was given by C.F., R.J.L., E.L., J.B. and J.M.F. All authors significantly contributed to the writing of the manuscript.
Open Research
PEER REVIEW
The peer review history for this article is available at https://publons.com/publon/10.1111/2041-210X.13812.
DATA AVAILABILITY STATEMENT
All code and data files are currently available at github.com/kimwhoriskey/yams, and have been permanently archived on Zenodo 10.5281/zenodo.5914936 (Whoriskey et al., 2022). A full implementation of YAMS is planned as an update to YAPS in 2022.