Topographic path analysis for modelling dispersal and functional connectivity: Calculating topographic distances using the topoDistance r package
Abstract
- Estimating biologically meaningful geographic distances is essential for research in disciplines ranging from landscape genetics to community ecology. Topographically correcting distances to account for the total overland distance between locations imposed by topographic relief provides one method for calculating geographic distances that account for landscape structure.
- Here, I present topoDistance, an r package for calculating shortest topographic distances, weighted topographic paths and topographic least cost paths (LCPs). Topographic distances are calculated by weighting the edges of a graph by the hypotenuse of the horizontal and vertical distances between raster cells and then finding the shortest total path between cells of interest. The package also includes tools for mapping topographic paths and plotting elevation profiles.
- Examples from a species with moderate dispersal abilities, the western fence lizard, inhabiting a topographically complex landscape, Yosemite National Park (USA), demonstrate that topographic distances can vary significantly from straight-line distances, and topographic LCPs can trace very different routes from LCPs and shortest topographic paths.
- Topographic paths and distances are broadly useful for modelling geographic isolation resulting from dispersal limitation for organisms that interact with the topographic structure of a landscape during movement and dispersal.
1 INTRODUCTION
Calculating biologically meaningful geographic distances is essential for research in a wide variety of disciplines in ecology and evolution, ranging from landscape genetics to population biology to movement ecology. Straight-line Euclidean distances are reasonable approximations of geographic separation in very few cases, and the increasing availability of GIS data has led organismal biologists to utilize geospatial methods for distance estimation (Calabrese & Fagan, 2004; Fischer & Lindenmayer, 2007; Ray, Lehmann, & Joly, 2002). For example, resistance-based distances, like those obtained through least cost path (LCP) analysis (Wang, Savage, & Bradley Shaffer, 2009) or circuit theory analysis (McRae, 2006), account for differential resistance to movement across heterogeneous landscapes and predict population isolation better than Euclidean distances in a wide range of systems (Landguth, Cushman, Murphy, & Luikart, 2010; McRae & Beier, 2007; Wang, Glor, & Losos, 2013). Resistance surfaces can be hard to parameterize, however, especially for systems with limited data and few a priori expectations (Peterman, 2018; Spear, Balkenhol, Fortin, McRae, & Scribner, 2010). Another metric, which does not require any parameterization and can provide useful approximations of geographic isolation and functional connectivity (the effective movement of individuals between points; Tischendorf & Fahrig, 2000), is topographic distance (Goldberg & Waits, 2010; Murphy, Dezzani, Pilliod, & Storfer, 2010; Spear, Peterson, Matocq, & Storfer, 2005; Wang, 2013).
Topographic distances account for the additional distance, beyond horizontal distance, imposed by topographic relief and, therefore, capture the full overland distance an organism must move between geographic locations. Topographic distances have been shown to correlate strongly with genetic distances (e.g. Goldberg & Waits, 2010; Murphy, Evans, & Storfer, 2010; Spear et al., 2005; Steele, Baumsteiger, & Storfer, 2009; Wang, 2009) and with differences in community composition (e.g. Cañedo-Argüelles et al., 2015; Dong et al., 2016; Glassman, Wang, & Bruns, 2017; Razeng et al., 2016) in a variety of systems. In both cases, topographic paths provide realistic approximations of dispersal-driven geographic connectivity mediating turnover in genetic or community composition. Several studies have determined topographic distances to be important for identifying dispersal and migration corridors (e.g. Dickson, Jenness, & Beier, 2005; Goljani, Kaboli, Karami, Ghodsizadeh, & Nourani, 2012), and other work has found that they reflect field-based measurements of dispersal (Wang & Shaffer, 2017).
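The idea of an overland distance can be illustrated with a few lines of base r. This is a minimal sketch on a made-up transect, assuming evenly spaced samples; the elevations, the 10 m spacing and the variable names are hypothetical and are not taken from the package.

```r
# Hypothetical elevations (m) sampled every 10 m along a straight transect.
elev <- c(100, 104, 112, 109, 120, 135)
cell_size <- 10  # assumed horizontal spacing between samples (m)

# Horizontal and vertical components of each step.
dh <- rep(cell_size, length(elev) - 1)
dv <- diff(elev)

# Each step's overland length is the hypotenuse of its two components.
step_len <- sqrt(dh^2 + dv^2)

planar_dist <- sum(dh)        # distance ignoring relief
topo_dist   <- sum(step_len)  # distance following the surface

cat("planar:", planar_dist, "m; topographic:", round(topo_dist, 1), "m\n")
```

The topographic distance can never be shorter than the planar distance along the same route, and the gap widens as relief increases.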
Here, I introduce topoDistance v1.0.1, an r package for calculating topographic distances and identifying topographic paths. Without a convenient tool in r, researchers have primarily relied on separate GIS software (e.g. ArcGIS) to calculate topographic distances, which may require licensed software or custom-coded solutions. Moreover, the calculation of shortest topographic paths can be cumbersome, leading many to estimate topographic distances as straight-line distances corrected for topographic relief, which does not accurately reflect biological movement between geographic locations on topographically complex landscapes (Goldberg & Waits, 2010; Goljani et al., 2012; Spear et al., 2005; Tonkin et al., 2017). In addition to calculating shortest topographic paths, topoDistance also estimates topographic LCPs (Wang & Summers, 2010), providing a valuable addition to the resistance-based distance approach.
2 FUNCTIONALITY
The topoDistance package provides functions for generating topographic movement surfaces and calculating shortest topographic distances, topographic least cost distances and weighted topographic distances between geographic locations using digital elevation model (DEM) raster layers. After identification of topographic paths, topoDistance can also plot the paths on shaded relief maps and plot topographic cross-section profiles.
topoDistance also provides convenient tools for plotting the topographic paths resulting from these analyses. The topoPathMap function produces a shaded relief, or hillshade, map with the topographic paths and allows users to specify the path colour and width, point colour and size, and angle and direction of shading. Finally, the topoProfile function can extract the elevation for points along a topographic path to provide a topographic cross section, or elevation profile, of the path. The cross sections can be drawn using base r plotting or as an interactive graph through the plotly package (Sievert et al., 2019).
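The roles of the shading angle and direction can be sketched with the standard shaded-relief illumination formula. The function below is an illustration in base r under stated assumptions: the name hillshade_cell, its argument names and its defaults are hypothetical, and the formula may differ in detail from what topoPathMap uses internally.

```r
# Illumination of one cell given its slope and aspect and the light
# source's elevation angle and compass direction (all in degrees).
hillshade_cell <- function(slope, aspect, angle = 45, direction = 0) {
  rad <- pi / 180
  sin(angle * rad) * cos(slope * rad) +
    cos(angle * rad) * sin(slope * rad) * cos((direction - aspect) * rad)
}

flat   <- hillshade_cell(slope = 0,  aspect = 0)    # level ground
facing <- hillshade_cell(slope = 30, aspect = 0)    # slope facing the light
away   <- hillshade_cell(slope = 30, aspect = 180)  # slope facing away

round(c(flat = flat, facing = facing, away = away), 3)
```

Slopes facing the light source are brighter than level ground and slopes facing away are darker, which is what gives a hillshade map its sense of relief.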
To validate the functions contained in topoDistance, I performed a series of basic tests on simplified landscapes (Appendix S1). These simplified rasters allowed me to compare the paths, distances and topographic surfaces returned by each function in the package to manually calculated values, and in all cases, the returned values matched the manual calculations. These tests are fully reproducible using the code provided in Appendix S1.
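A check of this kind can be sketched without the package. The base r function below is a simplified stand-in and not the package's implementation: it weights each edge between neighbouring cells by the hypotenuse of the horizontal and vertical distances and finds the shortest total path with a basic Dijkstra search. The function name topo_shortest and the 3 × 3 test landscapes are hypothetical.

```r
# Shortest topographic distance between two cells of a small elevation
# matrix, with cells joined to their (up to) 8 neighbours. Illustrative only.
topo_shortest <- function(elev, cell_size, from, to) {
  nr <- nrow(elev); nc <- ncol(elev)
  idx <- function(r, k) (k - 1) * nr + r        # column-major cell index
  best <- rep(Inf, nr * nc)                     # best known distance per cell
  done <- rep(FALSE, nr * nc)
  best[idx(from[1], from[2])] <- 0
  target <- idx(to[1], to[2])
  repeat {
    # Pick the closest unsettled cell (simple O(n^2) Dijkstra).
    cand <- which(!done)
    cur <- cand[which.min(best[cand])]
    if (cur == target || is.infinite(best[cur])) break
    done[cur] <- TRUE
    r <- (cur - 1) %% nr + 1
    k <- (cur - 1) %/% nr + 1
    for (dr in -1:1) for (dk in -1:1) {
      rr <- r + dr; kk <- k + dk
      if ((dr == 0 && dk == 0) || rr < 1 || rr > nr || kk < 1 || kk > nc) next
      h <- cell_size * sqrt(dr^2 + dk^2)        # 1 or sqrt(2) cell widths
      v <- elev[rr, kk] - elev[r, k]            # elevation difference
      nxt <- idx(rr, kk)
      best[nxt] <- min(best[nxt], best[cur] + sqrt(h^2 + v^2))
    }
  }
  best[target]
}

flat <- matrix(0, 3, 3)
peak <- flat; peak[2, 2] <- 100   # one 100 m pinnacle in the centre cell

# By hand: straight across the flat grid is 2 * 10 m; skirting the
# pinnacle takes two diagonal steps of 10 * sqrt(2) m each.
d_flat <- topo_shortest(flat, 10, c(2, 1), c(2, 3))
d_peak <- topo_shortest(peak, 10, c(2, 1), c(2, 3))
c(flat = d_flat, peak = d_peak)
```

On the second landscape the straight route over the pinnacle would cost roughly 201 m, so the shortest topographic path goes around it, which is the behaviour the manual calculation predicts.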
3 EXAMPLE APPLICATIONS
3.1 Topographic distances and paths
Calculating topographic distances is most important in areas with high levels of topographic complexity. To demonstrate how topographic distances differ from straight-line distances, I calculated the shortest topographic paths between three known localities for the western fence lizard, Sceloporus occidentalis, a small vertebrate with moderate vagility (Stebbins & McGinnis, 2012), in Yosemite National Park (USA), a landscape known for its dramatic topographic relief. These localities included Lost Lake, Mirror Lake and the base of the Illilouette Gorge at the eastern end of the Yosemite Valley, and I used the topoDist function to calculate topographic distances and paths based on a 1/3 arc-sec (~10 m) DEM raster downloaded from the U.S. Geological Survey (USGS) National 3D Elevation Program (www.usgs.gov/3dep/). The topographic paths connecting these localities, plotted with the topoPathMap function, clearly follow the contours of the landscape (Figure 1), tracing the edge of the valley floor from the Illilouette Gorge to Mirror Lake and passing through a narrow canyon between Liberty Cap and Mt. Broderick on the way to Lost Lake. These two paths are 14.0% and 14.9% longer than the corresponding straight-line distances, calculated in the raster package (Hijmans et al., 2019), and the topographic path between Mirror and Lost Lakes, passing around the steep ridge southwest of Half Dome, is 33.7% longer than the straight-line distance (Figure 1).
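The percentage comparison is simple arithmetic, sketched below in base r. The great-circle function assumes a spherical Earth with a 6,371 km radius, so its values may differ slightly from those returned by GIS packages that use an ellipsoid. The coordinates are two of the points listed with the code in Section 3.2; the topographic distance is a placeholder invented for the example and is not a result from the analysis.

```r
# Great-circle (straight-line) distance in km on a sphere; the radius
# is an assumption of this sketch.
haversine_km <- function(lon1, lat1, lon2, lat2, radius_km = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * radius_km * asin(sqrt(a))
}

# Percentage by which a topographic distance exceeds the straight-line one.
pct_longer <- function(topo, straight) 100 * (topo - straight) / straight

# Two coordinates (longitude, latitude) from the Section 3.2 example.
straight_km <- haversine_km(-119.5566, 37.7247, -119.4718, 37.7608)

# Hypothetical topographic distance for the same pair, NOT a result from
# the paper; it only illustrates the arithmetic.
topo_km <- 1.15 * straight_km
pct_longer(topo_km, straight_km)
```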
3.2 Topographic LCPs
Resistance-based distance analyses, including LCP analysis and circuit theory analysis, frequently disregard any underlying topography, essentially assuming that paths can be estimated without topographically correcting the distances on a resistance surface. However, topography shapes and constrains organismal movement across many landscapes (Dickson et al., 2005; Dong et al., 2016; Murphy, Evans, et al., 2010; Spear et al., 2005). To illustrate the importance of considering topographic relief for LCP analysis, I used the topoLCP function to identify topographic least cost paths (TLCPs) between three additional localities for S. occidentalis in Yosemite National Park (USA): Happy Isles, Tenaya Canyon and the top of Sunrise Creek east of Clouds Rest. For the topographic surface, I used the same DEM. For the resistance surface, I used a habitat suitability model, a common method for parameterizing a resistance surface (Spear et al., 2010; Wang, Yang, Bridgman, & Lin, 2008). The habitat suitability model was constructed with Maxent (Phillips, Anderson, & Schapire, 2006) through the dismo r package (Hijmans, Phillips, & Elith, 2017) using the DEM and 19 bioclimatic data layers, downloaded from the WorldClim Database (www.worldclim.org) and rescaled to match the resolution of the DEM, as predictor variables. For the occurrence points, I downloaded S. occidentalis records from the VertNet Database (www.vertnet.org) for the Yosemite region and spatially rarefied them to a set of 85 localities by requiring a 1-km minimum distance between points using the spThin r package (Aiello-Lammens, Boria, Radosavljevic, Vilela, & Anderson, 2019). On the resulting habitat suitability raster, cells with higher suitability are considered less resistant to movement (Wang et al., 2008). To compare with the TLCPs, I also inferred shortest topographic paths using the topoDist function and LCPs (without topographic correction) using the gdistance r package.
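The spatial thinning step can be illustrated with a greedy filter in base r. This is a simplified stand-in, not the randomised procedure implemented in spThin, and it runs on simulated projected coordinates in kilometres rather than the VertNet records; the function name thin_points and all values are hypothetical.

```r
# Greedy spatial thinning: keep a record only if it lies at least
# `min_km` from every record already kept.
thin_points <- function(xy_km, min_km = 1) {
  keep <- integer(0)
  for (i in seq_len(nrow(xy_km))) {
    d <- sqrt((xy_km[keep, 1] - xy_km[i, 1])^2 +
              (xy_km[keep, 2] - xy_km[i, 2])^2)
    # all() of an empty vector is TRUE, so the first record is always kept.
    if (all(d >= min_km)) keep <- c(keep, i)
  }
  xy_km[keep, , drop = FALSE]
}

set.seed(1)
pts <- cbind(x = runif(200, 0, 10), y = runif(200, 0, 10))  # simulated records
thinned <- thin_points(pts, min_km = 1)

nrow(thinned)        # number of records retained
min(dist(thinned))   # closest remaining pair, never below min_km
```

A greedy pass depends on the order of the records and will not always retain the largest possible set, which is one reason dedicated tools repeat the thinning many times.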
The following code shows how simple this analysis is in topoDistance. It starts with defining the xy coordinates for the localities:
library(topoDistance)
# One row per locality: longitude, latitude (decimal degrees)
xy <- matrix(ncol = 2, byrow = TRUE,
             c(-119.5566, 37.7247,
               -119.4718, 37.7608,
               -119.5157, 37.7669))
The topographic LCPs can then be calculated using the topoLCP function and plotted using the topoPathMap function:
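# Topographic LCPs; paths = TRUE returns the paths along with the distances
tlcp <- topoLCP(Yosemite$DEM, Yosemite$SDM, xy, paths = TRUE)
# Draw the paths over a hillshade map with the cost surface overlaid
topoPathMap(Yosemite$DEM, xy, tlcp, costSurface = Yosemite$SDM,
            type = "hillshade", pathColor = "purple")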
Finally, the resulting topographic LCPs can be compared to the shortest topographic paths by calculating the latter with the topoDist function and adding them to the existing map with the lines function:
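# Shortest topographic paths between the same localities
td <- topoDist(Yosemite$DEM, xy, paths = TRUE)
# The second list element holds the paths; add them as dashed lines
lines(td[[2]], lty = 2, lwd = 4, col = "darkred")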
The results show that TLCPs can diverge substantially from LCPs and shortest topographic paths (Figure 2). For example, the TLCP, which accounts for topographic distance and landscape resistance, between Happy Isles and Sunrise Creek follows the gently sloping Merced River, whereas the LCP, which ignores topography, follows some rolling and forested terrain roughly 2 km to the north. In this case, the LCP actually passes through some less suitable habitat along the John Muir Trail, compared to moderately less resistant habitat along the Merced River, because the horizontal distance is considerably shorter, suggesting TLCP analysis could uncover more realistic dispersal routes. Differences are also clear when comparing TLCPs and shortest topographic paths – the shortest topographic path between Sunrise Creek and Tenaya Canyon passes through high elevation, very low suitability habitat near Clouds Rest, while the TLCP tracks mid-elevation, open forest habitat to the south (Figure 2).
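How the three kinds of path can favour different routes is sketched below in base r on two made-up routes. The sketch assumes that a topographic least cost combines overland step length with resistance by simple multiplication; that is an illustrative assumption and may not match how topoLCP combines the two surfaces. The function name route_costs and all values are hypothetical.

```r
# Cost of one route under three criteria, given per-step horizontal
# distance (dh), elevation change (dv) and resistance.
route_costs <- function(dh, dv, resistance) {
  surface <- sqrt(dh^2 + dv^2)           # overland length of each step
  c(lcp  = sum(dh * resistance),         # resistance over planar length
    topo = sum(surface),                 # overland length only
    tlcp = sum(surface * resistance))    # resistance over overland length
}

# Two made-up routes between the same endpoints: one short in plan view
# but steep, one longer in plan view but gentle and less resistant.
steep  <- route_costs(dh = rep(10, 5), dv = rep(20, 5),
                      resistance = rep(1.0, 5))
gentle <- route_costs(dh = rep(10, 8), dv = rep(1, 8),
                      resistance = rep(0.8, 8))

round(rbind(steep, gentle), 1)
```

In this toy case the planar least cost criterion prefers the steep route because it is horizontally shorter, whereas the topographically corrected criterion prefers the gentle one, the same kind of divergence described for the Merced River example.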
3.3 Topographic cross sections
Finally, examining the elevation profile of a path can help to characterize the relative positions of populations or segments along the path, describe where a path encounters ecological transition zones, identify areas of terrain with different slopes and compare the vertical trajectories of different paths (Giordano, Ridenhour, & Storfer, 2007; Greco, Fremier, Larsen, & Plant, 2007; Lowe, Likens, McPeek, & Buso, 2006). Using the topoProfile function, I compared topographic cross sections for the shortest topographic path and a weighted topographic path from North Dome to Sunrise Creek. I calculated the shortest topographic path using the topoDist function and a weighted topographic path using the topoWeightedDist function with a linear function to weight changes in aspect angle and an exponential function to weight the slope between cells, because the energetic cost of traversing an incline typically scales exponentially:
# Coordinates (longitude, latitude) of the two path endpoints
xy <- matrix(ncol = 2, byrow = TRUE,
             c(-119.5616, 37.7625,
               -119.4718, 37.7608))
# Shortest topographic path
td <- topoDist(Yosemite$DEM, xy, paths = TRUE)
# Weighted topographic path
twd <- topoWeightedDist(Yosemite$DEM, xy, hFunction = "linear",
                        vFunction = "exponential", paths = TRUE)
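Why an exponential slope weight tends to favour longer, gentler routes can be sketched in base r. The weights below are hypothetical and scaled only for illustration; the built-in "linear" and "exponential" options of topoWeightedDist may be defined differently, and the route_cost function is not part of the package.

```r
# Cost of a route = sum over steps of (overland length x slope weight).
route_cost <- function(dh, dv, weight) {
  slope <- atan(abs(dv) / dh)                 # slope angle per step (radians)
  sum(sqrt(dh^2 + dv^2) * weight(slope))
}

w_linear <- function(slope) 1 + 3 * slope     # hypothetical linear weight
w_exp    <- function(slope) exp(3 * slope)    # hypothetical exponential weight

# Two made-up routes that both climb 48 m (distances in metres).
direct <- list(dh = rep(10, 4),  dv = rep(12, 4))   # short but steep
detour <- list(dh = rep(10, 12), dv = rep(4, 12))   # longer but gentle

costs <- rbind(
  linear      = c(direct = route_cost(direct$dh, direct$dv, w_linear),
                  detour = route_cost(detour$dh, detour$dv, w_linear)),
  exponential = c(direct = route_cost(direct$dh, direct$dv, w_exp),
                  detour = route_cost(detour$dh, detour$dv, w_exp))
)
round(costs, 1)
```

With these particular weights the direct climb is cheaper under linear weighting, but the detour becomes cheaper once steep steps are penalised exponentially.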
I then combined the two types of paths and used the topoProfile function to plot their topographic cross sections:
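# Take the first path from each result and combine them into one object
topopaths <- rbind(td[[2]][1], twd[[2]][1])
# Plot both elevation profiles together as an interactive plotly graph
topoProfile(Yosemite$DEM, topopaths, type = "plotly",
            singlePlot = TRUE)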
The resulting topographic profiles demonstrate some of the differences between shortest topographic paths and weighted topographic paths (Figure 3). Although the weighted topographic path is longer, it minimizes elevation changes and steep ascents. The shortest topographic path, in contrast, takes a more direct route but rises steeply out of Tenaya Canyon, gaining over 1,400 m in less than 3.5 km as it passes over Clouds Rest before descending back down to Sunrise Creek (Figure 3).
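The quantities read off an elevation profile can be computed directly. The base r sketch below uses a made-up path with evenly spaced vertices, and the variable names are hypothetical. It also works through the gradient implied by a 1,400 m gain over 3.5 km, assuming that the 3.5 km is horizontal distance, which the text does not specify.

```r
# Made-up path: elevation (m) at successive vertices spaced 10 m apart.
elev <- c(1200, 1203, 1210, 1222, 1219, 1230)
dh   <- rep(10, length(elev) - 1)

# Profile table: cumulative overland distance against elevation.
profile <- data.frame(
  distance_m  = c(0, cumsum(sqrt(dh^2 + diff(elev)^2))),
  elevation_m = elev
)
total_ascent <- sum(pmax(diff(elev), 0))   # uphill steps only

# Mean slope implied by gaining 1,400 m over 3,500 m (assumed horizontal).
mean_slope_deg <- atan(1400 / 3500) * 180 / pi

profile
c(total_ascent_m = total_ascent, mean_slope_deg = round(mean_slope_deg, 1))
```

Because the text reports a gain of more than 1,400 m in less than 3.5 km, the slope computed here is a lower bound on the mean slope of that climb.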
4 CONCLUSIONS
Landscape topography plays a prominent role in shaping movement and dispersal for a wide range of terrestrial organisms (Goljani et al., 2012; Murphy, Evans, et al., 2010; Wang & Shaffer, 2017). Hence, identifying the topographic paths between points and calculating the distances along them provides estimates of functional geographic distances that are potentially valuable for a variety of applications (e.g. Cañedo-Argüelles et al., 2015; Dong et al., 2016; Murphy, Evans, et al., 2010; Spear et al., 2005). The topoDistance package uses a straightforward, optimal algorithm for identifying shortest topographic paths and topographic LCPs that will reliably return the shortest or least costly topographic distances for any landscape. Topographic distances will often vary considerably from straight-line distances, and estimates of topographic distance can change if the resolution of the underlying elevation raster changes. So, consideration should be given to choosing an elevation raster with proper resolution for the study system and whether to incorporate a parameterized resistance surface.
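The dependence on raster resolution can be illustrated in base r on a simulated transect. Coarsening is done here by keeping every third or ninth sample, a simplification of how elevation rasters are actually resampled; the transect, the seed and the function name surface_length are all hypothetical.

```r
# Simulated rugged transect: elevation (m) every 10 m over 2.7 km.
set.seed(42)
elev10 <- 1000 + cumsum(rnorm(271, mean = 0, sd = 4))

# Overland length of a transect with evenly spaced samples.
surface_length <- function(elev, spacing) sum(sqrt(spacing^2 + diff(elev)^2))

# Coarsen by keeping every 3rd and every 9th sample (30 m and 90 m spacing).
elev30 <- elev10[seq(1, length(elev10), by = 3)]
elev90 <- elev10[seq(1, length(elev10), by = 9)]

lengths_m <- c(res_10m = surface_length(elev10, 10),
               res_30m = surface_length(elev30, 30),
               res_90m = surface_length(elev90, 90))
round(lengths_m, 1)
```

Each coarser profile connects a subset of the finer profile's points with straight segments, so its length cannot exceed that of the finer profile; coarse rasters therefore smooth away fine-scale relief and tend to shorten topographic distances.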
The resolution and size of the elevation raster will affect processing times as well. For a Linux computer with a 3.6 GHz processor, the average time to calculate shortest topographic distances between 10 randomly chosen points was 46 s for 1,000 × 1,000 cell landscapes, 186 s (3.1 min) for 2,000 × 2,000 cell landscapes and 438 s (7.3 min) for 3,000 × 3,000 cell landscapes. Calculations on very large rasters with more than 1 × 10^7 cells may be memory limited on systems with less than 32 GB of RAM. When mapping the shortest topographic paths in addition to calculating their distances, the average processing time was 125 s (2.1 min) for 1,000 × 1,000 cell landscapes, 473 s (7.9 min) for 2,000 × 2,000 cell landscapes and 1,080 s (18.0 min) for 3,000 × 3,000 cell landscapes. The most time-consuming steps are calculating the topographic distance surface, which is done once for each locality, and mapping the topographic paths, which is done for each pair of localities. So, processing time will increase linearly with the number of localities when calculating topographic distances and approximately quadratically, in proportion to the number of locality pairs, when mapping topographic paths. Still, even on a landscape with 1 × 10^7 cells, topographic distances among 50 localities can be calculated in <15 min without mapping paths or in <6 hr when mapping paths, using a computer with a 3.6 GHz processor.
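The counts behind these scaling statements can be tabulated in base r. The timings are the distance-only values reported above; the locality counts are arbitrary examples and the variable names are hypothetical.

```r
# Work units as the number of localities grows.
n_localities <- c(10, 25, 50)
workload <- data.frame(
  localities        = n_localities,
  distance_surfaces = n_localities,             # one per locality
  path_pairs        = choose(n_localities, 2)   # one per unordered pair
)

# Distance-only timings reported in the text, per million raster cells.
side  <- c(1000, 2000, 3000)
secs  <- c(46, 186, 438)
cells <- side^2
secs_per_million_cells <- secs / (cells / 1e6)

workload
round(secs_per_million_cells, 1)
```

The time per million cells stays close to constant across the three landscape sizes, which suggests that the distance calculation scales roughly in proportion to the number of cells over this range.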
Overall, the calculation of topographic paths and distances in the topoDistance package is relatively quick and straightforward. By using common r object types, topoDistance output is ready to use for an assortment of downstream applications, including spatial analyses used in landscape ecology, like multiple matrix regression (Wang, 2013) and generalized dissimilarity modelling (Fitzpatrick et al., 2011), and a variety of r packages used in landscape genetics, like BEDASSLE (Bradburd, Ralph, & Coop, 2013), Sunder (Botta, Eriksen, & Guillot, 2015), and PopGenReport (Adamack & Gruber, 2014).
ACKNOWLEDGEMENTS
I thank Katie Everson and Luke Macaulay for their help testing the topoDistance package. This project was supported by the California Agricultural Experiment Station, the USDA National Institute of Food and Agriculture (Hatch project 1007819), and a CAREER grant from the National Science Foundation (DEB-1845682).
DATA AVAILABILITY STATEMENT
The topoDistance package is freely available from the CRAN repository (https://CRAN.R-project.org/package=topoDistance) and GitHub (https://github.com/ianjwang/topoDistance). It can be installed in R using the command install.packages("topoDistance") or, with the devtools package installed, devtools::install_github("ianjwang/topoDistance"). A package manual and vignette for getting started are available from CRAN and the GitHub repository. The DEM used in the example applications was downloaded from the USGS National 3D Elevation Program (https://www.sciencebase.gov/catalog/item/5c89d259e4b09388244f047f). The bioclimatic data layers used to construct the habitat suitability model are available from the WorldClim database (http://biogeo.ucdavis.edu/data/worldclim/v2.0/tif/base/wc2.0_30s_bio.zip), and the occurrence records were downloaded from the VertNet database (http://portal.vertnet.org/search?q=sceloporus+occidentalis).