Volume 10, Issue 8
APPLICATION

Conducting social network analysis with animal telemetry data: Applications and methods using spatsoc

Alec L. Robitaille

Corresponding Author

E-mail address: robit.alec@gmail.com

Department of Biology, Memorial University of Newfoundland, St. John’s, Newfoundland and Labrador, Canada

Quinn M. R. Webber

Cognitive and Behavioural Ecology Interdisciplinary Program, Memorial University of Newfoundland, St. John’s, Newfoundland and Labrador, Canada

Eric Vander Wal

Department of Biology, Memorial University of Newfoundland, St. John’s, Newfoundland and Labrador, Canada

Cognitive and Behavioural Ecology Interdisciplinary Program, Memorial University of Newfoundland, St. John’s, Newfoundland and Labrador, Canada

First published: 24 May 2019
The software package spatsoc, developed as part of this research effort, was extensively reviewed and approved by the rOpenSci project (https://ropensci.org). A full record of the review is available at: https://github.com/ropensci/spatsoc

Abstract

  1. We present spatsoc, an R package for conducting social network analysis with animal telemetry data.
  2. Animal social network analysis is a method for measuring relationships between individuals to describe social structure. Proximity‐based social networks are generated from animal telemetry data by grouping relocations temporally and spatially, using thresholds that are informed by the characteristics of the species and study system.
  3. spatsoc fills a gap in R packages by providing flexible functions, explicitly for animal telemetry data, to generate edge lists and gambit‐of‐the‐group data, perform data‐stream randomization, and generate group by individual matrices.
  4. With spatsoc, current users of animal telemetry or otherwise georeferenced data for movement or spatial analyses will have access to efficient and intuitive functions to generate social networks.

1 INTRODUCTION

Animal social network analysis is a method for measuring the relationships between individuals to describe social structure (Croft, James, & Krause, 2008; Farine & Whitehead, 2015; Pinter‐Wollman et al., 2014; Wey, Blumstein, Shen, & Jordán, 2008). Association networks are built from a set of observed elements of social community structure and are useful to understand a variety of ecological and behavioural processes, including disease transmission, interactions between individuals, and community structure (Pinter‐Wollman et al., 2014). Among the most common types of social network data collection is gambit‐of‐the‐group, where individuals observed in the same group are assumed to be associating or interacting (Franks, Ruxton, & James, 2010). Similar to gambit‐of‐the‐group, proximity based social networks (PBSNs) are association networks based on close proximity between individuals (Spiegel, Leu, Sih, & Bull, 2016). PBSNs rely on spatial location datasets that are typically acquired by georeferenced biologging methods such as radio‐frequency identification tags, radiotelemetry, and global positioning system (GPS) devices (hereafter, animal telemetry).

Biologging using GPS devices allows simultaneous spatiotemporal sampling of multiple individuals in a group or population, thus generating large datasets that may otherwise be challenging to collect. The advent of biologging technology allows researchers to study individuals of species that range across large areas, migrate long distances, or spend time in inaccessible areas (Cagnacci, Boitani, Powell, & Boyce, 2010; Cooke et al., 2013; Hebblewhite & Haydon, 2010). Moreover, the recent increase in the number of studies using GPS telemetry to study movement ecology (Kays, Crofoot, Jetz, & Wikelski, 2015; Tucker et al., 2018) indicates the potential for a large number of existing datasets that may be retroactively analysed to test a priori hypotheses about animal social structure. As animal telemetry data have become more accessible and available at a fine scale, a number of techniques and methods have been developed to quantify various aspects of animal social structure (Webber & Vander Wal, 2018). These include dynamic interaction networks (Long, Nelson, Webb, & Gee, 2014), PBSNs (Spiegel, Sih, Leu, & Bull, 2017) and the development of traditional randomization techniques to assess non‐random structure of PBSNs constructed using animal telemetry data (Spiegel et al., 2016). Despite the recent increase in the number of studies using animal telemetry data and GPS relocation data (Webber & Vander Wal, 2019), there is no comprehensive R package that generates PBSNs using animal telemetry data.

Here, we present spatsoc (v0.1.9), a package developed for the R programming language (R Core Team, 2018) to (1) convert animal telemetry data into gambit‐of‐the‐group format to build PBSNs, (2) implement data‐stream social network randomization methods of animal telemetry data (Farine & Whitehead, 2015; Spiegel et al., 2016), and (3) provide flexible spatial and temporal grouping of individuals from large datasets. Animal telemetry data can be complex both temporally (e.g. data can be partitioned into monthly, seasonal, or yearly segments) and spatially (e.g. subgroups, communities, or populations). Functions in spatsoc were developed taking these complexities into account and provide users with flexibility to select relevant parameters based on the biology of their study species and systems and test the sensitivity of results across spatial and temporal scales.

2 FUNCTIONS

The spatsoc package provides functions for using animal telemetry data to generate PBSNs. Relocations are converted to gambit‐of‐the‐group using grouping functions which can be used to build PBSNs. Alternatively, edge functions can be used to generate edge lists to build PBSNs. Raw data streams can be randomized where animal telemetry data are swapped between individuals at hourly or daily scales (Farine & Whitehead, 2015), or within individuals using a daily trajectory method (Spiegel et al., 2016).

3 GROUPING

Gambit‐of‐the‐group data are generated from animal telemetry data where individuals are grouped based on temporal and spatial overlap. The spatsoc package provides one temporal grouping function:
  1. group_times groups animal telemetry relocations into time groups (Figure 1). The function accepts date time formatted data and a temporal threshold argument. The temporal threshold argument allows users to specify a time window within which relocations are grouped, for example: 5 min, 2 hr, or 10 days.
Figure 1. Temporal grouping with group_times. (a) A full temporal data stream of regular fixes at 2 hr intervals for four individuals (example data described in Table 1). (b) An example showing the temporal deviation around the set fix rate. Temporal grouping with a threshold of 5 min groups these relocations to the nearest 5 min interval. Times within the temporal threshold, for example 5 min in this case, are grouped together. (c) Temporal grouping with a threshold of 8 hr showing the relocations being grouped to the nearest 8 hr interval. (d) Temporal grouping with a threshold of 10 days with all relocations being grouped in 10‐day chunks.

group_times compares the date and time of each relocation to a regular interval defined by the temporal threshold. For example, a 5‐minute threshold will compare the date and time of each relocation to 5‐minute time intervals throughout each day. Each relocation is grouped to the nearest time interval at a maximum temporal distance of half the threshold before or past the time interval.
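As a rough illustration of this rounding rule, the following Python sketch (not spatsoc's R implementation) assigns the first relocation of each individual in Table 1 to a 5‐minute interval. The helper name `nearest_interval` is hypothetical, the timestamps are treated as UTC for simplicity, and rounding elapsed seconds to the nearest multiple of the threshold is an assumption standing in for whatever group_times does internally.

```python
from datetime import datetime, timedelta, timezone

def nearest_interval(ts, threshold):
    """Round a timestamp to the nearest multiple of `threshold` (a timedelta)."""
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    step = threshold.total_seconds()
    elapsed = (ts - epoch).total_seconds()
    # Adding half a step before flooring rounds to the nearest interval, so a
    # relocation is never more than half the threshold away from its interval.
    return epoch + timedelta(seconds=((elapsed + step / 2) // step) * step)

# First relocation of individuals E, F, G and H from Table 1 (assumed UTC).
fixes = {
    "E": "2016-11-01 00:00:51",
    "F": "2016-11-01 00:00:27",
    "G": "2016-11-01 00:00:48",
    "H": "2016-10-31 23:59:56",
}
rounded = {
    ind: nearest_interval(
        datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S").replace(tzinfo=timezone.utc),
        timedelta(minutes=5),
    )
    for ind, stamp in fixes.items()
}

# Number the distinct intervals in chronological order to obtain time groups.
order = {t: i + 1 for i, t in enumerate(sorted(set(rounded.values())))}
timegroup = {ind: order[t] for ind, t in rounded.items()}
print(timegroup)
```

In this sketch the fix recorded for H just before midnight falls in the same time group as the three fixes recorded just after midnight, because all four are within 2.5 min of the 00:00 interval.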

The spatsoc package provides three spatial grouping functions:
  1. group_pts measures the geographic distance between animal telemetry relocations within each time group based on a spatial threshold provided by the user (Figure 2). A distance matrix is constructed to measure the distance between all individuals. The threshold is used to binarize the distance matrix and the connected components are labelled to form spatial groups. The connected components represent individuals within the threshold distance from one another. We apply the chain rule (Croft et al., 2008) where 3 or more individuals that are all within the defined threshold distance of at least one other individual are considered in the same group. For point based spatial grouping with a distance threshold that does not use the chain rule, see edge_dist below.
  2. group_lines groups overlapping movement trajectories generated from animal telemetry data (Figure 3). Movement trajectories for each individual within each time group, for example, 8 hr, 1 day, or 20 days, are generated and grouped based on spatial overlap of lines produced from trajectories. If a spatial distance threshold is provided, trajectories are buffered by this distance before spatial overlap.
  3. group_polys generates and groups overlapping home ranges using kernel utilization distributions or minimum convex polygons generated in adehabitatHR of individuals and optionally returns a measure of proportional area overlap (Figure 4). Home ranges are generated for each individual in each timegroup, providing efficient comparison of home ranges through time, for example, multiple days, seasons, or years.
Figure 2. Point based spatial grouping with group_pts. (a) Three relocations for four individuals in three time groups (example data described in Table 1). The relocation in the second timegroup for all individuals is buffered, to depict the distance threshold (in this case 50 m) used to generate spatial groups. The temporal threshold used is 5 min (see Figure 1b). (b) A distance matrix of relocations for all four individuals at timegroup 2 where highlighted rows are pairwise distances that meet the user defined criteria for spatial grouping, that is, they are less than the spatial threshold. (c) The connected components showing the chain rule implementation of point based distance grouping with group_pts. The connected components show individuals E, F, and G grouped (group 2 coloured blue), despite individuals F and G being further apart than the spatial threshold, since they were both within the threshold distance from E. Individual H is assigned a group on their own, since they are not within the spatial threshold of any other individuals (group 9 coloured pink). (d) Output spatiotemporal groups from group_pts showing individuals (“ID”), timegroups (“timegroup”), and spatiotemporal groups (“group”).

Figure 3. Line based spatial grouping with group_lines. (a) Three daily trajectories for four individuals generated using a time threshold of 1 day (see Figure 1c) and group_lines (example data described in Table 1). A spatial threshold of 50 m is used, represented by the buffered portions around each individual's trajectory on the second day, or timegroup 2. (b) Output spatial groups from group_lines showing individuals (“ID”), timegroups (“timegroup”), and spatiotemporal groups (“group”).

Figure 4. Home range based spatial grouping with group_polys. (a) Home ranges for four individuals generated using a temporal threshold of 30 days (see Figure 1d). group_polys generates and groups overlapping home ranges of individuals. It either returns (b) binary overlap or (c) a measure of proportional area containing the area of overlap (km²) and proportion of overlap among individuals.

For spatial grouping functions, individuals that are not within the distance threshold, or that do not overlap with any other individuals are assigned to a group on their own.
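The chain rule used for point based grouping can be illustrated with a small Python sketch. It is not spatsoc's implementation: the function name `chain_rule_groups` and the coordinates are hypothetical, and a union‐find structure is assumed here as one common way to label connected components of the thresholded distance matrix.

```python
from itertools import combinations
from math import dist

def chain_rule_groups(points, threshold):
    """Label connected components of the 'within threshold' graph."""
    parent = {ind: ind for ind in points}

    def find(ind):
        # Follow parent links to the representative of this component.
        while parent[ind] != ind:
            parent[ind] = parent[parent[ind]]  # path halving
            ind = parent[ind]
        return ind

    # Join every pair of individuals closer than the threshold.
    for a, b in combinations(points, 2):
        if dist(points[a], points[b]) < threshold:
            parent[find(a)] = find(b)

    # Number components in order of first appearance.
    labels = {}
    return {ind: labels.setdefault(find(ind), len(labels) + 1) for ind in points}

# Hypothetical coordinates (m) for one time group: F and G are 80 m apart,
# but each is 40 m from E, so the chain rule joins all three.
points = {"E": (0, 0), "F": (40, 0), "G": (-40, 0), "H": (500, 500)}
groups = chain_rule_groups(points, threshold=50)
print(groups)
```

With a 50 m threshold, E, F, and G share one group label and H, which is not within the threshold of any other individual, receives a group of its own.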

4 EDGE LISTS

The spatsoc package provides two edge list generating functions:
  1. edge_nn calculates the nearest neighbour to each individual within each time group (Figure 5). If the optional distance threshold is provided, it is used to limit the maximum distance between neighbours. edge_nn returns an edge list of each individual and their nearest neighbour.
  2. edge_dist calculates the geographic distance between animal telemetry relocations within each time group and returns all paired relocations within the spatial threshold (Figure 5). edge_dist uses a distance matrix like group_pts, but, in contrast, does not use the chain rule to group relocations. Instead, it returns an edge list of each individual and all others within the spatial distance threshold.
Figure 5. Edge list generating functions edge_nn and edge_dist. Panels show relocations and output edge lists for four individuals (E, F, G, H) for one timegroup from example data described in Table 1. Note that the distances between the individuals shown here are presented in Figure 2b. (a and b) show edges generated with edge_dist. Edges between individuals are generated if the distance between relocations is within the spatial threshold. (c and d) show edges generated with edge_nn. Edges are created by identifying the nearest neighbour to each individual in each timegroup. Optionally, users may specify a maximum distance within which to consider a nearest neighbour relevant.

For edge list generating functions, individuals that are not within the distance threshold of any other individual, or that do not have a nearest neighbour (or have none within the distance threshold, if provided), are returned as NA.
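To contrast the two edge list approaches, here is a minimal Python sketch of the underlying logic for a single time group. The function names `edges_within` and `nearest_neighbours` and the coordinates are hypothetical. Returning both directions of each dyad and using `None` where R would return NA are choices made for this sketch, not statements about spatsoc's output format.

```python
from math import dist

def edges_within(points, threshold):
    """All ordered pairs of individuals closer than the threshold (no chain rule)."""
    return [(a, b) for a in points for b in points
            if a != b and dist(points[a], points[b]) < threshold]

def nearest_neighbours(points, threshold=None):
    """Nearest neighbour of each individual, or None if none is within the threshold."""
    edges = {}
    for a in points:
        # Distance from a to every other individual, smallest first.
        d, nn = min((dist(points[a], points[b]), b) for b in points if b != a)
        edges[a] = nn if threshold is None or d < threshold else None
    return edges

# Hypothetical coordinates (m) for four individuals in one time group.
points = {"E": (0, 0), "F": (40, 0), "G": (-45, 0), "H": (500, 500)}
print(edges_within(points, 50))
print(nearest_neighbours(points))
print(nearest_neighbours(points, threshold=50))
```

Unlike the chain rule, the distance based edge list contains no edge between F and G, because they are 85 m apart. Without a maximum distance, H is still paired with its nearest neighbour despite being hundreds of metres away; with a 50 m maximum, H has no neighbour.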

5 RANDOMIZATIONS

Randomization procedures in social network analysis are important to test assumptions of spatial and temporal non‐independence of social association data (Farine & Whitehead, 2015). Data‐stream randomization is the recommended randomization technique for social network users (Farine & Whitehead, 2015) and involves swapping individuals and group observations within or between temporal groups and individuals (Farine, 2017). Animal telemetry data have inherent temporal structure and are well suited to randomization methods. The spatsoc package provides three data‐stream randomization methods:
  1. Step ‐ randomizes identities of animal telemetry relocations between individuals within each time step.
  2. Daily ‐ randomizes daily animal telemetry relocations between individuals, preserving the order of time steps.
  3. Trajectory ‐ randomizes daily trajectories generated from animal telemetry relocations within individuals (Spiegel et al., 2016).

The randomizations function returns the input data with random fields appended, ready to use by the grouping functions or to build social networks. Step and daily methods return a “randomID” field that can be used in place of the ID field and the trajectory method returns a “randomDatetime” that can be used in place of the datetime field. The randomizations function in spatsoc allows users to split randomizations between spatial or temporal subgroups to ensure that relocations are only swapped between or within relevant individuals.
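The step method can be sketched in a few lines of Python. This illustrates the idea only and is not the package's code: the function name `randomize_step` and the row layout are hypothetical, and the sketch allows an individual to be reassigned its own identity, which is an assumption.

```python
import random
from collections import defaultdict

def randomize_step(rows, seed=None):
    """Shuffle IDs among the relocations of each time group.

    `rows` is a list of dicts with "ID" and "timegroup" keys. A copy of each
    row is returned with a "randomID" key added.
    """
    rng = random.Random(seed)
    by_timegroup = defaultdict(list)
    for i, row in enumerate(rows):
        by_timegroup[row["timegroup"]].append(i)

    out = [dict(row) for row in rows]
    for indices in by_timegroup.values():
        ids = [rows[i]["ID"] for i in indices]
        rng.shuffle(ids)  # permute identities within this time step only
        for i, new_id in zip(indices, ids):
            out[i]["randomID"] = new_id
    return out

# Four individuals observed in each of three time groups.
rows = [{"ID": ind, "timegroup": tg} for tg in (1, 2, 3) for ind in "EFGH"]
randomized = randomize_step(rows, seed=1)
for row in randomized[:4]:
    print(row)
```

Because identities are only permuted within a time step, each time group keeps the same set of individuals and the same relocations; only the assignment of identities to relocations changes.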

6 USING SPATSOC IN SOCIAL NETWORK ANALYSIS

spatsoc is integrated with social network analysis in R to generate and randomize PBSNs. First, users will generate temporal groups with group_times. Next, users will generate PBSNs from spatial groups:
  1. Generate gambit‐of‐the‐group data with spatial grouping functions (group_pts, group_lines, and group_polys)
  2. Generate group by individual matrices (get_gbi)
  3. PBSN data‐stream randomization (randomizations)
or edge lists:
  1. Generate edge lists (edge_dist and edge_nn)
  2. PBSN data‐stream randomization (randomizations)

Before spatiotemporal grouping or edge list generation, users should first determine relevant temporal and spatial grouping thresholds.

7 SELECTING AND EVALUATING SPATIAL AND TEMPORAL THRESHOLDS

Functions provided by spatsoc emphasize flexibility, allowing users to adjust arguments to suit their specific use case. The temporal threshold argument of group_times accepts units of minutes, hours, or days to generate temporal groups at different scales. The spatial threshold defines the distance used to generate spatial groups and edge lists. The spatial and temporal thresholds used for generating PBSNs with spatsoc must be considered carefully, and we recommend that the thresholds used are based on the nuances of the animal telemetry data, study species, system, and specific research questions. Nevertheless, there are no hard and fast rules for selecting thresholds for spatiotemporal grouping (but see below for recommendations). Evaluating candidate thresholds is recommended and has been shown to provide valuable insights for selecting temporal (Psorakis et al., 2015) and spatial (Davis, Crofoot, & Farine, 2018) thresholds.

It is important that the temporal threshold matches the spatial function used. In the case of point based spatial grouping and edge list generating functions, the temporal threshold must be shorter than the fix interval of the telemetry device. If not, an individual may have multiple relocations in a timegroup and potentially be grouped with itself. The temporal threshold for these functions will likely be in units of minutes or hours. For line and polygon based spatial grouping, the temporal threshold will necessarily encompass multiple relocations for each individual. Lines must be built with at least 2 points and there are specific requirements for number and distribution of relocations for building home ranges (Cumming & Cornélis, 2012; Laver & Kelly, 2008).
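One way to check this condition before point based grouping is to look for individuals that occur more than once in a time group. The Python sketch below assumes time groups have already been assigned; the function name `repeated_in_timegroup` and the time group labels are hypothetical.

```python
from collections import Counter

def repeated_in_timegroup(rows):
    """Return (ID, timegroup) pairs that occur more than once."""
    counts = Counter((row["ID"], row["timegroup"]) for row in rows)
    return sorted(pair for pair, n in counts.items() if n > 1)

# Hypothetical time group labels for three 2 hr fixes of individual E.
fine = [{"ID": "E", "timegroup": tg} for tg in (1, 2, 3)]    # e.g. 5 min threshold
coarse = [{"ID": "E", "timegroup": tg} for tg in (1, 1, 2)]  # e.g. 8 hr threshold

print(repeated_in_timegroup(fine))    # []
print(repeated_in_timegroup(coarse))  # [('E', 1)]
```

An empty result suggests the temporal threshold is compatible with point based grouping; any returned pair flags a time group in which an individual could be grouped with itself.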

While the spatial and temporal thresholds are informed by the biology of the study species and research questions, there are a number of behavioural, morphological, and ecological factors that could influence threshold distance. These include, but are not limited to, body size, daily movement rate, communication distance (Cameron & Toit, 2005), gregariousness (Godde, Humbert, Côté, Réale, & Whitehead, 2013), and degree of fission‐fusion (Haddadi et al., 2011). Some empirical examples from the literature include 5 body lengths for white‐faced capuchin monkeys Cebus capucinus (Crofoot, Rubenstein, Maiya, & Berger‐Wolf, 2011), within arm's reach for chimpanzees Pan troglodytes (Fraser, Schino, & Aureli, 2008), 2 m for sleepy lizards Tiliqua rugosa (Leu, Bashford, Kappeler, Michael, & Bull, 2010), and 100 m for bison Bison bison (Merkle, Sigaud, & Fortin, 2015). Leu et al. (2010) also measured the median GPS device precision to estimate an effective range of 2–26 m when using a spatial threshold of 2 m. In summary, these examples suggest that smaller bodied species tend to have shorter threshold distances than larger bodied species, while highly active and gregarious species, including most primates, tend to also have shorter threshold distances.

Finally, spatsoc can be used to compare networks generated with different grouping methods across a range of spatial and temporal thresholds. For example, Davis et al. (2018) compared association networks generated from wild baboon Papio anubis telemetry data using spatial thresholds with the chain rule (as in group_pts), spatial thresholds without the chain rule (as in edge_dist), and nearest neighbours (as in edge_nn). Similarly, Castles et al. (2014) compared proximity networks of chacma baboons Papio ursinus built with the chain rule (as in group_pts) and without (as in edge_dist) and using nearest neighbours with a maximal distance (as in edge_nn).

8 GENERATING NETWORKS

Here, we will provide an example of point based spatial grouping with spatsoc’s example caribou telemetry data (Table 1). The data consist of 10 individuals with relocations recorded every 2 hr. The coordinates “X” and “Y” are in units of meters and the coordinate system is UTM Zone 21N.

Table 1. Expected data input for spatsoc; the relocations for each individual with a timestamp column
ID  X       Y        datetime
E   701672  5504286  2016-11-01 00:00:51
F   705583  5513813  2016-11-01 00:00:27
G   699636  5509635  2016-11-01 00:00:48
H   701724  5504325  2016-10-31 23:59:56
E   701656  5504196  2016-11-01 01:59:58
F   706625  5514043  2016-11-01 02:00:23
G   699369  5509699  2016-11-01 02:00:56
H   701648  5504276  2016-11-01 02:00:01
E   701688  5504266  2016-11-01 04:00:25
F   706793  5514015  2016-11-01 04:00:18
G   699383  5509700  2016-11-01 04:00:43
H   701607  5504291  2016-11-01 04:00:54
  • These rows are a subset from the package's example caribou movement data of 10 individuals collected every 2 hr. The individual identifier (“ID”) and timestamp (“datetime”) columns are character type and the coordinates (“X” and “Y”) are numeric. This example shows the first three relocations for four individuals (E, F, G, and H).

In this case, we will use a temporal threshold of 5 min and a spatial distance threshold of 50 m given the size and behaviour of the study species (Peignier et al., 2019). The combination of spatial and temporal thresholds means that any individuals within 50 m of each other within 5 min will be assigned to the same group. Please note that spatsoc is designed to work with the data.table package, which is used in the following example to read the input data and to cast the datetime column from character to a date time format, and is also used internally by spatsoc functions.

[R code listing: image in the original, not reproduced here]

After the temporal and spatial grouping is completed with group_times and group_pts, a group by individual matrix is generated (described by Farine and Whitehead (2015)). A group by individual matrix has one column per individual and one row per group, with a boolean indicating the membership of each individual in each group.

[R code listing: image in the original, not reproduced here]
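The structure of a group by individual matrix can be shown with a short Python sketch. It only illustrates the data structure and does not reproduce get_gbi: the function name `group_by_individual` and the group assignments are hypothetical, and 0/1 integers stand in for booleans.

```python
def group_by_individual(rows):
    """Build a 0/1 matrix with one row per group and one column per individual."""
    ids = sorted({row["ID"] for row in rows})
    groups = sorted({row["group"] for row in rows})
    members = {(row["group"], row["ID"]) for row in rows}
    matrix = [[int((g, ind) in members) for ind in ids] for g in groups]
    return groups, ids, matrix

# Hypothetical spatiotemporal groups for four individuals in two time groups.
rows = [
    {"ID": "E", "group": 1}, {"ID": "F", "group": 1},
    {"ID": "G", "group": 2}, {"ID": "H", "group": 3},
    {"ID": "E", "group": 4}, {"ID": "F", "group": 5},
    {"ID": "G", "group": 5}, {"ID": "H", "group": 6},
]
groups, ids, matrix = group_by_individual(rows)
print("group", *ids)
for g, line in zip(groups, matrix):
    print(g, *line)
```

Each row of the matrix sums to the size of that group, so individuals grouped on their own appear as rows with a single 1.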

After generating the group by individual matrix, it is passed directly to asnipe, the animal social network package (Farine, 2013), to generate a proximity based social network. Note, in this example we use the simple ratio index (SRI) as an association index because all individuals are correctly identified and observed at each relocation event (i.e. the equivalent to an observational period for networks generated using focal observations).

[R code listing: image in the original, not reproduced here]
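For readers unfamiliar with the simple ratio index, the Python sketch below computes it for each dyad using the standard definition SRI = x / (x + y_AB + y_A + y_B), where x counts sampling periods with both individuals together, y_AB periods with both observed but apart, and y_A and y_B periods with only one of them observed. Treating each time group as a sampling period is an assumption of this sketch, the function name and data are hypothetical, and asnipe may compute the index differently when given a group by individual matrix.

```python
from itertools import combinations

def simple_ratio_index(rows):
    """SRI per dyad, using time groups as sampling periods."""
    seen = {}  # timegroup -> {ID: group}
    for row in rows:
        seen.setdefault(row["timegroup"], {})[row["ID"]] = row["group"]

    ids = sorted({row["ID"] for row in rows})
    sri = {}
    for a, b in combinations(ids, 2):
        x = y_ab = y_a = y_b = 0
        for groups in seen.values():
            if a in groups and b in groups:
                if groups[a] == groups[b]:
                    x += 1     # observed together
                else:
                    y_ab += 1  # both observed, but apart
            elif a in groups:
                y_a += 1       # only a observed
            elif b in groups:
                y_b += 1       # only b observed
        total = x + y_ab + y_a + y_b
        sri[(a, b)] = x / total if total else 0.0
    return sri

# Hypothetical spatiotemporal groups for four individuals in two time groups.
rows = [
    {"ID": "E", "timegroup": 1, "group": 1}, {"ID": "F", "timegroup": 1, "group": 1},
    {"ID": "G", "timegroup": 1, "group": 2}, {"ID": "H", "timegroup": 1, "group": 3},
    {"ID": "E", "timegroup": 2, "group": 4}, {"ID": "F", "timegroup": 2, "group": 5},
    {"ID": "G", "timegroup": 2, "group": 5}, {"ID": "H", "timegroup": 2, "group": 6},
]
sri = simple_ratio_index(rows)
print(sri)
```

When every individual is observed in every time group, as with regular telemetry fixes, y_A and y_B are zero and the index reduces to the proportion of time groups in which the dyad was together.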

9 DATA‐STREAM RANDOMIZATION

To perform network data‐stream permutations, the randomizations function is used to permute spatial and temporal groupings and rebuild PBSNs at each iteration. In this example, we use the “step” method to randomize between individuals at each time step for 500 iterations. The output randStep contains the observed and randomized data and can subsequently be used to generate group by individual matrices, networks, and calculate network metrics. An extended form of this example is provided in the vignette “Using spatsoc in social network analysis” (see Resources).

[R code listing: image in the original, not reproduced here]

The splitBy argument can be used in the randomizations function (as well as in the edge list generating and spatial grouping functions) to delineate the spatial subsets (e.g. groups or populations) or temporal segments (e.g. weeks, months, or years) of the data within which PBSNs will be generated. For example, in large datasets with individuals in two distinct populations and data over many years, users may use the splitBy argument to generate PBSNs for each population‐by‐year combination in a single call, as opposed to generating each PBSN separately.
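The effect of splitting a randomization can be sketched in Python as a stratified shuffle: identities are only permuted among relocations that share the same values of the split columns and the same time group. This is an illustration under stated assumptions rather than the package's implementation; the function name `randomize_step_split`, the column names "population" and "year", and the data are hypothetical.

```python
import random
from collections import defaultdict

def randomize_step_split(rows, split_by, seed=None):
    """Shuffle IDs within each combination of split columns and time group."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for i, row in enumerate(rows):
        key = tuple(row[col] for col in split_by) + (row["timegroup"],)
        strata[key].append(i)

    random_ids = [None] * len(rows)
    for indices in strata.values():
        ids = [rows[i]["ID"] for i in indices]
        rng.shuffle(ids)  # identities never leave their stratum
        for i, new_id in zip(indices, ids):
            random_ids[i] = new_id
    return random_ids

# Two hypothetical populations, each with three individuals, over two years.
rows = [
    {"ID": ind, "population": pop, "year": year, "timegroup": 1}
    for year in (2016, 2017)
    for pop, inds in (("north", "ABC"), ("south", "XYZ"))
    for ind in inds
]
random_ids = randomize_step_split(rows, split_by=["population", "year"], seed=1)
for row, rid in zip(rows, random_ids):
    print(row["population"], row["year"], row["ID"], "->", rid)
```

In this sketch an individual from the northern population can only be swapped with another northern individual from the same year, which mirrors the intent of restricting swaps to relevant individuals.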

10 IMPLICATIONS

spatsoc represents a novel integration of tools for generating PBSNs from animal telemetry data. The grouping and randomization functions allow users to efficiently and rapidly generate a large number of PBSNs within the spatsoc environment. spatsoc will be of interest and use to a wide range of behavioural ecologists who either already use social network analysis or those who typically work with GPS relocation data but are interested in becoming social network users. We advocate for the use of spatsoc in conjunction with the most recent “how to” on social network analysis (Farine & Whitehead, 2015) as well as other R packages, such as asnipe (Farine, 2013) and igraph (Csárdi & Nepusz, 2006), to facilitate greater sharing of computational and statistical efficiencies and ideas for users of social network analysis.

11 RESOURCES

spatsoc is free and open source software available on CRAN (stable release) and at https://github.com/ropensci/spatsoc (development version). It is licensed under the GNU General Public License 3.0. spatsoc depends on other R packages: data.table (Dowle & Srinivasan, 2018), igraph (Csárdi & Nepusz, 2006), rgeos (Bivand & Rundel, 2018), sp (Bivand, Pebesma, & Gomez‐Rubio, 2013), and adehabitatHR (Calenge, 2006). Documentation of all functions and detailed vignettes (including “Introduction to spatsoc”, “Frequently asked questions”, and “Using spatsoc in social network analysis”) can be found on the companion website at spatsoc.robitalec.ca. We welcome feature requests, bug reports, and suggested improvements through the issue board at https://github.com/ropensci/spatsoc/issues.

12 FUTURE DIRECTIONS

In the future, we intend to produce vignettes which highlight the role of spatsoc for generating social networks for other types of data collection commonly used in social network analysis. For example, data collected using passive‐integrated transponders (e.g. Aplin et al., 2013) are increasingly being used to generate animal social networks (Webber & Vander Wal, 2019) and spatsoc could represent a novel and computationally efficient way to generate social networks for large PIT‐tag datasets. The basic principles of spatsoc and grouping functions can be applied to other data types, including PIT‐tags, as long as both spatial and temporal information are known. We are also developing additional grouping methods including dyadic grouping and clustering methods. The dyadic grouping method will extract multiple simultaneous relocations for a dyad through time (e.g. for similar application see Lesmerises, Johnson, & St‐Laurent, 2018) and will have applications for collective and coordinated movement of dyads. Meanwhile, the clustering method will identify spatially and temporally clustered relocations for individuals, or groups of individuals, and could have applications for identifying preferred habitats for groups as well as locations of scavenging or predation (e.g. for similar applications see Knopff, Knopff, Warren, & Boyce, 2009; Kermish‐Wells, Massolo, Stenhouse, Larsen, & Musiani, 2018; Cristescu, Stenhouse, & Boyce, 2014).

ACKNOWLEDGEMENTS

We thank all members of the Wildlife Evolutionary Ecology Lab, including Juliana Balluffi‐Fry, Sana Zabihi‐Seissan, Erin Koen, Michel Laforge, Christina Prokopenko, Julie Turner, Levi Newediuk, Richard Huang, and Chris Hart for their comments on previous versions of this manuscript. We thank Michel Robitaille for comments on the French version of the abstract. We thank Tyler Bonnell, Martin Leclerc, and Shane Frank for testing the package ahead of its release as well as two anonymous reviewers for comments that greatly improved the manuscript and the package. We also thank the rOpenSci organization for their package on‐boarding process including rOpenSci reviewers, Priscilla Minotti and Filipe Teixeira, and editor, Lincoln Mullen, for their code review, which contributed to improving this package. Funding for this study was provided by a Vanier Canada Graduate Scholarship to Q.M.R.W. and an NSERC Discovery Grant to E.V.W.

    AUTHORS’ CONTRIBUTIONS

    A.L.R., Q.M.R.W., and E.V.W. conceived of the original package concept. A.L.R. developed the package. A.L.R. and Q.M.R.W. drafted the manuscript and all co‐authors contributed critically to the drafts and gave final approval for publication.

    DATA ACCESSIBILITY

    All data and code used to produce figures are available on GitHub at https://github.com/robitalec/spatsoc-application-paper and on Zenodo at https://doi.org/10.5281/zenodo.2824869. The data are also included with the package and can be imported with:

    [R code listing: image in the original, not reproduced here]

    CITATION

    Users of spatsoc should cite this article directly. A formatted citation and BibTeX entry are provided in R:

    [R code listing: image in the original, not reproduced here]
