Volume 10, Issue 11
APPLICATION
Open Access

esdm: A tool for creating and exploring ensembles of predictions from species distribution and abundance models

Samuel M. Woodman

Marine Mammal and Turtle Division, Southwest Fisheries Science Center, National Marine Fisheries Service, National Oceanic and Atmospheric Administration, La Jolla, CA, USA

Correspondence: Samuel M. Woodman (sam.woodman@noaa.gov)

Karin A. Forney

Marine Mammal and Turtle Division, Southwest Fisheries Science Center, National Marine Fisheries Service, National Oceanic and Atmospheric Administration, Moss Landing, CA, USA

Moss Landing Marine Laboratories, San Jose State University, Moss Landing, CA, USA

Elizabeth A. Becker

Marine Mammal and Turtle Division, Southwest Fisheries Science Center, National Marine Fisheries Service, National Oceanic and Atmospheric Administration, La Jolla, CA, USA

Cooperative Institute for Marine Ecosystems and Climate (CIMEC), University of California, Santa Cruz, Santa Cruz, CA, USA

Monica L. DeAngelis

West Coast Regional Office, National Marine Fisheries Service, National Oceanic and Atmospheric Administration, Long Beach, CA, USA

Elliott L. Hazen

Environmental Research Division, Southwest Fisheries Science Center, National Marine Fisheries Service, National Oceanic and Atmospheric Administration, Monterey, CA, USA

Daniel M. Palacios

Marine Mammal Institute and Department of Fisheries and Wildlife, Hatfield Marine Science Center, Oregon State University, Newport, OR, USA

Jessica V. Redfern

Marine Mammal and Turtle Division, Southwest Fisheries Science Center, National Marine Fisheries Service, National Oceanic and Atmospheric Administration, La Jolla, CA, USA

First published: 14 August 2019

This article has been contributed to by US Government employees and their work is in the public domain in the USA.

Abstract

  1. Species distribution models (SDMs) are a valuable statistical approach for both understanding species distributions and identifying potential impacts of environmental changes or management decisions on species, but multiple SDMs for the same species in a region can create confusion in decision‐making processes.
  2. One solution is to create ensembles (i.e. combinations) of predictions from existing SDMs. However, creating ensembles can be challenging if the predictions were made at different spatial resolutions, using different data sources, or with different prediction value types (e.g. abundance and probability of occurrence).
  3. We present esdm, an r package that allows users to create an ensemble of SDM predictions overlaid onto a single base geometry. These predictions can be evaluated (e.g. through among‐model uncertainty or AUC, TSS and RMSE metrics), mapped, and exported. esdm includes a built‐in GUI created using the r package shiny, which makes the package accessible to non‐r users.
  4. We provide an overview of esdm functionality and use esdm to create an ensemble of predictions from three blue whale Balaenoptera musculus SDMs for the California Current Ecosystem.

1 INTRODUCTION

Species distribution models (SDMs; i.e. habitat‐based occurrence models or ecological niche models) characterize the relationship between spatially and temporally explicit species observations and environmental data. SDMs are widely used to predict species distribution and abundance based on habitat covariates, and these predictions can be used to make conservation and management decisions (Elith & Leathwick, 2009; Gregr, Baumgartner, Laidre, & Palacios, 2013). The increased use of SDMs worldwide (Guisan et al., 2013) has created new challenges when multiple SDMs for the same species in a single region produce conflicting results (Araújo & New, 2007; Jones‐Farrand et al., 2011). Individual SDMs may identify unique ecological niches or suggest different management actions because of the strengths, biases, and limitations of each underlying dataset and model algorithm (Jones‐Farrand et al., 2011). These issues are often difficult to reconcile and incorporate into management decision‐making.

An ensemble (i.e. a weighted or unweighted average or combination) provides an established method for resolving differences between individual models and estimating uncertainty (Araújo & New, 2007; Marmion, Parviainen, Luoto, Heikkinen, & Thuiller, 2009). For example, model ensembles have been widely used in global climate change assessments to evaluate mean predictions and associated uncertainties (Annan & Hargreaves, 2010; Tebaldi & Knutti, 2007). In addition, ensembles have been successfully used to model species distributions (e.g. Forney, Becker, Foley, Barlow, & Oleson, 2015; Grenouillet, Buisson, Casajus, & Lek, 2011; Oppel et al., 2012; Pikesley et al., 2013; Scales et al., 2016), although these studies each relied upon a single data source. In each case, the authors created ensembles by averaging corresponding predictions from SDMs that applied different model algorithms to the same original species and environmental data. Several existing software tools implement this method, including the r packages (R Core Team, 2019) biomod2 (Thuiller, Georges, Engler, & Breiner, 2019) and sdm (Naimi & Araújo, 2016).

A different approach is needed when multiple data sources exist. Integrated analyses, such as a Bayesian hierarchical framework, can be used to obtain a single, probabilistic assessment of species distributions from several original data sources (e.g. Golding & Purse, 2016; Hefley & Hooten, 2016). However, this approach is not always practical for general use because it requires extensive statistical expertise and is generally time‐consuming and computationally challenging. Simpler methods for combining information from multiple data sources exist (e.g. Merow, Wilson, & Jetz, 2017; Pacifici et al., 2017), but still require the original data sources. If original data are unavailable, SDM predictions derived from these original data may be the only accessible information for a particular region. Combining or reconciling these predictions can be difficult, particularly if they were created using different methods or at different spatial resolutions (but see Sansom, Wilson, Caldow, & Bolton, 2018 for methods comparing prediction maps from different sources).

For example, multiple predictions from blue whale Balaenoptera musculus SDMs for the California Current Ecosystem (CCE) have been published (Becker et al., 2016; Hazen et al., 2017; Redfern et al., 2017), although some of the underlying datasets are not publicly available. These predictions were created at several spatial resolutions, in various coordinate systems, and using different data sources, habitat covariates, and modelling frameworks. In addition, the SDMs predicted absolute density, habitat preference, and relative density (e.g. density calculated without line transect correction factors; see Redfern et al., 2017), respectively (see Table 1 for model details).

Table 1. A summary of the individual SDMs that predicted the blue whale distributions used in the example analysis
Model citation        | Becker et al. (2016)                      | Hazen et al. (2017)                     | Redfern et al. (2017)
Abbreviation          | Model_B                                   | Model_H                                 | Model_R
Whale data source     | 1991–2009 shipboard line‐transect surveys | 1994–2008 satellite telemetry data      | 1991–2009 shipboard line‐transect surveys
Modelling framework   | Generalized additive model (GAM)          | Generalized additive mixed model (GAMM) | Generalized additive model (GAM)
Spatial resolution    | 0.09° × 0.09° (~10 × 10 km) grid          | 0.25° × 0.25° (~25 × 25 km) grid        | 10 × 10 km equal‐area grid
Prediction value type | Absolute density                          | Habitat preference                      | Relative density

We present esdm (Ensemble tool for predictions from Species Distribution Models), an r package with a built‐in graphical user interface (GUI) for creating ensembles of SDM predictions. esdm allows users to overlay SDM predictions onto a single base geometry, create ensembles of these overlaid predictions, and evaluate, map, and export predictions. It also provides several options for incorporating or calculating uncertainty. The information provided by this tool can assist users in identifying spatial uncertainties and making informed conservation and management decisions. esdm (v0.3.0; https://doi.org/10.5281/zenodo.3371754) is available on CRAN, and the GUI can be run locally or accessed online. esdm uses the r package sf (Pebesma, 2018) for fast processing of spatial data, while the GUI, created using the r package shiny (Chang, Cheng, Allaire, Xie, & McPherson, 2019), makes the tool accessible to non‐r users. In this paper, we provide an overview of esdm functionality and use the GUI to create and evaluate ensembles of predictions from the three blue whale SDMs (Table 1).

2 esdm OVERVIEW

Creating ensemble predictions using esdm requires three major steps: (a) importing original SDM predictions, (b) overlaying the original predictions onto a single base geometry, and (c) creating ensemble predictions via a weighted or unweighted average of rescaled overlaid predictions. Additional steps may include evaluating, mapping, or exporting predictions. Validation data can be read from comma‐separated value (CSV) and GIS files (shapefiles and file geodatabase feature classes) as either binary (i.e. species presence/absence) or count data. esdm allows users to calculate several common evaluation metrics: area under the receiver operating characteristic curve (AUC; Fielding & Bell, 1997), true skill statistic (TSS; Allouche, Tsoar, & Kadmon, 2006), and root‐mean‐square error (RMSE). AUC and TSS measure the discriminatory ability of an SDM, and can be calculated with predictions of any value type. RMSE, a scale‐dependent measure that requires count validation data, evaluates both the discriminatory ability and calibration of an SDM. Users can export ensemble predictions to calculate other metrics. Uncertainty associated with predictions (e.g. standard error values) can be imported, mapped, and used to weight predictions in an ensemble or calculate uncertainty values for the ensemble predictions. Ensemble uncertainty can also be assessed using the among‐model variance. In addition, the GUI allows users to create maps of predictions and additional objects, such as validation data or areas of human use (e.g. shipping lanes).
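
For readers who want to verify or extend these metrics outside the package, the calculations reduce to a few lines of base r. The sketch below assumes a numeric prediction vector and binary (1/0) or count validation values at the same locations, and uses common formulations (rank‐based AUC; TSS maximized over candidate thresholds); esdm's evaluation_metrics function provides equivalent metrics, although its interface and internal conventions may differ.

    # Illustrative base r formulations of the three metrics (not the package's
    # internal code). 'pred' holds SDM predictions at validation points, 'obs'
    # holds binary presence/absence (1/0), and 'counts' holds count data.
    auc_rank <- function(pred, obs) {
      # Rank-based (Mann-Whitney) estimate of the area under the ROC curve
      r  <- rank(pred)
      n1 <- sum(obs == 1)
      n0 <- sum(obs == 0)
      (sum(r[obs == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
    }

    tss_max <- function(pred, obs) {
      # TSS = sensitivity + specificity - 1, maximized over candidate thresholds
      max(vapply(sort(unique(pred)), function(thr) {
        mean(pred[obs == 1] >= thr) + mean(pred[obs == 0] < thr) - 1
      }, numeric(1)))
    }

    rmse <- function(pred, counts) sqrt(mean((pred - counts)^2))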

The esdm GUI provides esdm functionality through a user‐friendly, web‐based interface. Alternatively, users familiar with r can incorporate esdm functions in their own code (see Table 2 for function descriptions). Here we present a flowchart of the GUI workflow (Figure 1) and describe the major steps of creating ensemble predictions.

Table 2. Brief descriptions of esdm functions; see package documentation for more details
Function Description
esdm_gui Launch the esdm GUI
ensemble_create Create a weighted or unweighted ensemble of SDM predictions, and calculate associated uncertainty values
ensemble_rescale Rescale predictions using the abundance or sum to one method
evaluation_metrics Calculate AUC, TSS, or RMSE of SDM predictions using validation data
model_abundance Calculate the predicted abundance using SDM density predictions and the area of the corresponding prediction polygons
overlay_sdm Overlay SDM predictions onto a base geometry
pts2poly_vertices Create polygon(s) from a data frame containing the longitude and latitude coordinates of the polygon vertices
pts2poly_centroids Create polygons from a data frame containing the longitude and latitude coordinates of a regular grid of polygon centroids
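
To illustrate how the functions in Table 2 fit together outside the GUI, the sketch below chains the overlay, rescaling, and ensemble steps for two sets of imported predictions (sf objects, imported as described in Section 2.1). The object names, column names, and argument names are illustrative assumptions; consult the package documentation (e.g. ?overlay_sdm, ?ensemble_create) for the exact interfaces.

    # Sketch of a script-based (non-GUI) workflow; argument and column names
    # are assumptions for illustration only.
    library(eSDM)
    library(sf)

    # 'model_r' and 'model_b' are sf objects of prediction polygons, each
    # assumed to carry a 'density' column (imported as in Section 2.1)

    # Overlay the Model_B predictions onto the Model_R geometry, requiring at
    # least 50% overlap of each base geometry polygon
    over_b <- overlay_sdm(st_geometry(model_r), model_b,
                          sdm.idx = "density", overlap.perc = 50)

    # Collect the overlaid predictions, rescale them so the value types are
    # comparable, and average them into an (equally weighted) ensemble
    overlaid <- st_sf(dens_r = model_r$density, dens_b = over_b$density,
                      geometry = st_geometry(model_r))
    resc <- ensemble_rescale(overlaid, x.idx = c("dens_r", "dens_b"),
                             y = "abundance", y.abund = 1648)
    ens  <- ensemble_create(resc, x.idx = c("dens_r", "dens_b"),
                            w = c(0.5, 0.5))
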
Figure 1. Flowchart detailing the workflow of the esdm GUI, i.e. the order in which users can access and use sections of the GUI. Grey ovals represent tabs within the GUI, orange squares represent files imported by users, and green arrow boxes represent files exported by users. Users can also load a saved GUI workspace rather than re‐importing and processing predictions during each session. In the ‘Create Ensemble Predictions’ tab, the user can apply ensemble weights based on user inputs or on metrics calculated in the ‘Evaluation Metrics’ tab. Additional details are provided in the text and the esdm GUI manual

2.1 Importing predictions

The esdm GUI accepts SDM predictions in several common formats (Figure 1) and processes them to create a ‘prediction polygon’ for each individual prediction value. These prediction polygons make up the ‘geometry’ of a set of predictions, similar to how individual cells make up a raster. When importing predictions from a CSV file, the provided coordinates must be WGS 84 geographic coordinates (i.e. decimal degrees) and represent the centroids of a regular grid of prediction polygons. The GUI can also read and process predictions from GIS files (rasters, shapefiles, and file geodatabase feature classes), which have already‐defined geometries and coordinate systems. Those writing their own r code can use the esdm function pts2poly_centroids to convert centroid coordinates to prediction polygons, and functions from the raster (Hijmans, 2019) and sf packages to import GIS files.
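
A brief sketch of both import routes follows; the file names and column layout are hypothetical, and the second argument of pts2poly_centroids is assumed here to be half the grid‐cell width (check ?pts2poly_centroids for the exact definition).

    # Importing predictions in r rather than through the GUI (illustrative)
    library(eSDM)
    library(sf)
    library(raster)

    # Gridded CSV predictions: WGS 84 centroid coordinates plus prediction values
    pred_df <- read.csv("model_b_predictions.csv")  # e.g. columns lon, lat, density, se
    pred_b  <- pts2poly_centroids(pred_df, 0.045)   # 0.09-degree cells (assumed half-width)

    # GIS files already carry their geometry and coordinate system
    pred_r <- st_read("model_r_predictions.shp")
    pred_h <- raster("model_h_predictions.tif")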

The GUI accepts ‘Abundance’, ‘Absolute density’, or ‘Relative density’ as prediction value types. Users should select ‘Relative density’ for value types that are proportional to density but do not represent an absolute abundance or density (e.g. probability of occurrence or habitat preference; see Aarts, Fieberg, & Matthiopoulos, 2012). The GUI allows the user to rescale these values if needed (described in Section 2.3 below).

2.2 Overlaying predictions

The overlay function, overlay_sdm, is the backbone of esdm. It overlays SDM predictions onto a single base geometry, transforming all predictions to the same spatial resolution and coordinate system (Figure 2). Within the GUI, users can choose which of the imported predictions to use as the base geometry and specify the coordinate system in which the overlay will be performed. They can also import polygons to clip or erase portions of the base geometry, such as to specify a study area or erase land from marine predictions.

Figure 2. Schematic illustration: (a) The base geometry, with the blue outline indicating the current base geometry polygon. (b) The geometry of the SDM predictions being overlaid. (c) The SDM predictions overlaid onto the base geometry. (d) Same as (c), with the intersection between the overlaid polygons and the current base geometry polygon (i.e. intersected polygons) coloured yellow

The overlay function intersects the prediction polygons from an original SDM with the prediction polygons from the user‐selected base geometry (i.e. base geometry polygons). It then calculates the percentage of each base geometry polygon that overlaps with these intersected polygons, ignoring intersected polygons that have missing (i.e. ‘NA’) prediction values. If this percentage meets or exceeds the user‐specified percent overlap threshold, the function calculates the overlaid prediction as an area‐weighted average of the predictions of the intersected polygons (i.e. areal interpolation; Goodchild & Lam, 1980). Otherwise, the function assigns that base geometry polygon an overlaid prediction of ‘NA’, thereby excluding it from any ensembles. Associated uncertainty values and weights are also overlaid using an area‐weighted average.
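
Conceptually, the overlay for a single base geometry polygon can be reproduced with a few sf operations, as sketched below; this is an illustration of the areal‐interpolation logic rather than the package's actual implementation, which handles the full set of polygons, uncertainty values, and weights.

    # Area-weighted overlay for one base geometry polygon (conceptual sketch)
    library(sf)

    overlay_one <- function(base_poly, preds, pred_col, overlap_min = 0.5) {
      # base_poly: sfc containing a single base geometry polygon
      # preds:     sf object of original prediction polygons (same CRS)
      # pred_col:  name of the prediction column in 'preds'
      pieces <- suppressWarnings(st_intersection(preds, base_poly))
      pieces <- pieces[!is.na(pieces[[pred_col]]), ]       # ignore NA predictions
      a <- as.numeric(st_area(pieces))
      if (sum(a) / as.numeric(st_area(base_poly)) < overlap_min) {
        return(NA_real_)                                   # below the overlap threshold
      }
      sum(pieces[[pred_col]] * a) / sum(a)                 # area-weighted average
    }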

2.3 Creating ensemble predictions

2.3.1 Rescaling different prediction value types

Overlaid predictions that have different prediction value types (e.g. absolute density vs. probability of occurrence) should be rescaled to ensure predictions do not contribute disproportionately to the ensemble. Users can either rescale predictions to a specified total abundance within the study area or, if they do not have an abundance estimate, rescale predictions to sum to one. These rescaling methods are inherently similar and result in ensembles with similar distribution patterns. However, only the abundance rescaling method results in an ensemble with a meaningful abundance estimate. If another rescaling method is desired, users can rescale predictions before importing them, or export and rescale overlaid predictions.
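
Both rescaling options amount to simple arithmetic on the overlaid predictions. A minimal sketch for density‐type predictions is shown below, assuming polygon areas in km²; esdm's ensemble_rescale implements these methods, with an interface described in the package documentation.

    # Rescaling sketches (illustrative; see ?ensemble_rescale for the package version)
    rescale_abund <- function(dens, area_km2, abund) {
      # Scale densities so that the total study-area abundance equals 'abund'
      dens * abund / sum(dens * area_km2, na.rm = TRUE)
    }
    rescale_sum1 <- function(x) x / sum(x, na.rm = TRUE)   # values sum to one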

2.3.2 Ensemble method

Ensembles can be created using a weighted or unweighted average of the rescaled predictions. Weights can be based on evaluation metrics (i.e. evaluation metric values, rescaled to sum to one, of the overlaid predictions), the inverse of the variance of the overlaid predictions, or assigned by users either for the entire study area or for each prediction polygon. Users can also regionally exclude predictions from the ensemble if they have some a priori reason to do so (e.g. known biases in a specific region). esdm calculates uncertainty for the ensemble predictions using either the user‐specified prediction uncertainties or the among‐model variance.
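
For predictions already overlaid onto a common geometry, the weighted average and the among‐model variance reduce to straightforward matrix arithmetic, as in the sketch below (a conceptual illustration; ensemble_create provides these options and may use different internal conventions).

    # 'p' is a matrix with one column of rescaled overlaid predictions per SDM;
    # 'w' is a weight vector that sums to one (illustrative sketch)
    ens_mean <- function(p, w) as.vector(p %*% w)

    ens_var <- function(p, w) {
      # Weighted among-model variance of the predictions about the ensemble mean
      m <- as.vector(p %*% w)
      rowSums(sweep((p - m)^2, 2, w, `*`))
    }

    # Example: weights rescaled from evaluation metrics, e.g. the overlaid TSS
    # values in Table 3
    tss <- c(0.742, 0.406, 0.756)
    w   <- tss / sum(tss)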

3 EXAMPLE ANALYSIS

Predictions from cetacean SDMs can be used to assess the risk of entanglements and ship‐strikes (e.g. Redfern et al., 2013), which represent the largest sources of anthropogenic injury or mortality for blue whales in the CCE (Carretta et al., 2018). Becker et al. (2016), Hazen et al. (2017), and Redfern et al. (2017) developed models of blue whale distributions in this region (henceforth Model_B, Model_H, and Model_R, respectively) that can provide information for risk assessments. However, the predictions from these models differ in some areas (Figure 3), making them challenging to use for management purposes. We use the esdm GUI to perform an example analysis that explores differences between the blue whale SDM predictions and creates an ensemble of the predictions, with associated uncertainty.

Figure 3. Maps of the original predictions and associated uncertainty for Model_B (Becker et al., 2016), Model_H (Hazen et al., 2017) and Model_R (Redfern et al., 2017). In the top row, predictions are colour‐coded using the numerical prediction value for each SDM. In the middle row, the original standard error values have the same colour‐coding as the top row. In the bottom row, the original predictions are colour‐coded using relative percentages (i.e. percentiles). In all maps, the red line is the California Current Ecosystem study area boundary, while the tan area represents the erasing polygon (i.e. the U.S. West Coast). For the top and middle rows, the units are whales per km² (left and right panels) and habitat preference (centre panels)

The three blue whale models differ in multiple ways (Table 1). Model_B predicted absolute whale densities using line‐transect survey data and 8‐day composites of predictor variables in a generalized additive model (GAM) framework. The predictions were made at a 0.09° (approximately 10 km) spatial resolution for August–November. Model_H used whale presences and pseudoabsences, derived from telemetry data, in a generalized additive mixed model (GAMM) framework to predict monthly whale probability of occurrence at a 0.25° (approximately 25 km) spatial resolution. Hazen et al. (2017) scaled these predictions by an independent abundance estimate; we follow the terminology used in Hazen et al. (2017) and refer to the scaled predictions as habitat preference. We averaged the Model_H predictions from August to November to match the other predictions. Model_R predicted relative whale densities using line‐transect survey data and predictor variables, averaged from late July to early December, in a GAM framework at a 10‐km spatial resolution.

We followed the methods of Becker et al. (2016) and used the mean of the summer/fall blue whale predictions for 2001, 2005 and 2008 (the years with both line transect surveys and satellite tracks) from each model in the example analysis. Interannual variability has been shown to be the greatest source of uncertainty for cetacean SDMs in this region (Becker et al., 2016), and thus we calculated standard errors (SEs) using the three yearly predictions from each model. We imported these mean predictions into the GUI and created maps to compare prediction values, uncertainty, and distribution patterns (Figure 3). All three models predicted high blue whale densities in the Southern California Bight and along the central California coast. However, the Model_H predictions also had high values north of 40°N, where shipboard survey sightings and telemetry records of blue whales have been infrequent (Barlow & Forney, 2007; Becker et al., 2018; Irvine et al., 2014).

To create overlaid predictions, we imported a study area polygon that spanned the CCE and loaded the GUI‐provided land polygon as the erasing polygon. We selected the equal‐area geometry of the Model_R predictions as the base geometry because polygon intersection and area calculations are most accurate in appropriate equal‐area coordinate systems. We also specified an overlap threshold of 50 percent, meaning that if less than half of a base geometry polygon intersected with the original predictions, that polygon was excluded from the ensemble. The different prediction value types of the overlaid SDMs were rescaled using an abundance estimate of 1,648 blue whales, the mean of the study area abundances predicted by Model_B for 2001, 2005, and 2008. For the Model_H predictions, this rescaling follows the method used in Hazen et al. (2017) to relate the predictions from the GAMMs to an independent study area abundance.

We evaluated SDM performance by calculating AUC and TSS using several binary validation datasets: (a) species presence/absence points derived from survey transects (Becker et al., 2016; 71 presence and 7,368 absence points), (b) home ranges (90% isopleths) derived from 171 satellite‐tagged blue whales (Irvine et al., 2014; 328 presence and 10,386 absence points), and (c) a combination of these two datasets. These validation data are not independent data, as the survey transects were used in Model_B and Model_R and the satellite telemetry data were used in Model_H. However, we are not aware of any independent validation datasets for blue whales that span the CCE. Combining these data resulted in validation data with at least some novel presence and absence points for all predictions.

The home ranges represent areas of high use for blue whales, as identified by a long‐term satellite tracking dataset (1994–2008; Irvine et al., 2014). To translate the home ranges into binary validation data, we assumed that greater home range overlap indicates a higher likelihood of whale presence. The home ranges for all whales spanned most of the CCE, making it unrealistic for individual home ranges to indicate presence. Consequently, we used cut‐off values for the number of overlapping home ranges to define presence and absence points. We performed a sensitivity analysis to identify cut‐off values that maximized the AUC values of the overlaid SDM predictions. We defined the centroid of each base geometry polygon as a presence if it intersected with the home ranges of at least twenty whales, and as an absence if it intersected with nine or fewer home ranges. Points that intersected with ten to nineteen home ranges were not included in the validation data.
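
A sketch of this sensitivity analysis is shown below; the object names (n_overlap, the number of home ranges intersecting each centroid, and pred, the overlaid predictions at those centroids) are hypothetical, and auc_rank is the helper sketched in Section 2.

    # Cut-off sensitivity sketch: classify centroids by the number of overlapping
    # home ranges and pick the cut-offs that maximize AUC (illustrative only)
    cutoffs <- expand.grid(pres_min = 10:30, abs_max = 0:9)
    cutoffs$auc <- apply(cutoffs, 1, function(co) {
      obs  <- ifelse(n_overlap >= co["pres_min"], 1,
                     ifelse(n_overlap <= co["abs_max"], 0, NA))
      keep <- !is.na(obs)
      auc_rank(pred[keep], obs[keep])
    })
    cutoffs[which.max(cutoffs$auc), ]   # here: presence >= 20, absence <= 9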

We calculated evaluation metrics using all three validation datasets to determine whether different predictions performed better with different validation datasets (i.e. the line transect or satellite telemetry data). The AUC and TSS values for the original and overlaid predictions were similar across validation datasets (Table 3), confirming that the overlay conserved the predicted distributions. Model_B and Model_R predictions had higher AUC and TSS values than Model_H predictions for all validation datasets (Table 3). However, the metrics indicated fair performance for the Model_H predictions and these predictions were included in all ensembles.

Table 3. Evaluation metrics, area under the receiver operating characteristic curve (AUC) and true skill statistic (TSS), for all example analysis predictions. The first two columns (‘AUC’ and ‘TSS’) contain metrics calculated using the combined validation dataset. Columns ‘AUC‐LT’ and ‘TSS‐LT’ contain metrics calculated using only the line transect validation dataset, while columns ‘AUC‐HR’ and ‘TSS‐HR’ contain metrics calculated using only the home range validation dataset. See the example analysis section for additional details
Predictions AUC TSS AUC‐LT TSS‐LT AUC‐HR TSS‐HR
Becker et al. (2016) original 0.912 0.717 0.732 0.374 0.963 0.824
Hazen et al. (2017) original 0.734 0.414 0.620 0.284 0.772 0.471
Redfern et al. (2017) original 0.919 0.756 0.684 0.290 0.980 0.882
Becker et al. (2016) overlaid 0.916 0.742 0.732 0.380 0.967 0.856
Hazen et al. (2017) overlaid 0.735 0.406 0.620 0.286 0.772 0.460
Redfern et al. (2017) overlaid 0.919 0.756 0.684 0.290 0.980 0.882
Ensemble – unweighted 0.915 0.772 0.699 0.345 0.972 0.888
Ensemble – AUC‐based weights 0.917 0.777 0.703 0.349 0.973 0.893
Ensemble – TSS‐based weights 0.920 0.785 0.708 0.352 0.975 0.900
Ensemble – variance‐based weights 0.888 0.670 0.713 0.344 0.936 0.764

We created ensembles using several weighting methods: equal weights (i.e. unweighted), AUC‐based weights (as in Oppel et al., 2012), TSS‐based weights (as in Scales et al., 2016), and weights calculated as the inverse of the prediction variance. The among‐model uncertainty of the unweighted ensemble allowed us to examine spatial agreement between the predictions (Figure 4). We found relative agreement between the overlaid predictions south of 40°N, particularly in areas of high prediction values along the California coast and in the Southern California Bight. However, the ensemble uncertainty values were greater north of 40°N where only the Model_H predictions were high, suggesting that the northern ensemble predictions should be used with caution. The ensemble created using TSS‐based weights had the highest evaluation metrics of the ensemble predictions, and mostly higher AUC and TSS scores than the original predictions (Table 3). Its distribution patterns also visually matched known blue whale habitat (Calambokidis et al., 2015; Figure 5), and thus we considered it the ‘best’ ensemble for this example analysis.

Figure 4. Maps of an unweighted ensemble of the overlaid predictions, the among‐model standard error (SE) of the ensemble predictions, and the associated coefficient of variation (CV). The tan area represents the U.S. West Coast. The units are whales per km²
Figure 5. Maps of the ensemble created with weights based on TSS and associated uncertainty. In the top row, the prediction and standard error (SE) values are both colour‐coded using the same numerical scale (whales per km²). In the bottom row, the predictions are colour‐coded using relative percentages (i.e. percentiles) and the right‐most map includes the presence points from the combined validation dataset as black dots. The tan area represents the U.S. West Coast

4 DISCUSSION

Using the esdm GUI, we successfully created an ensemble of mean blue whale predictions from Becker et al. (2016), Hazen et al. (2017), and Redfern et al. (2017) despite their different spatial resolutions, data sources, and prediction value types. The best ensemble predictions identified known blue whale habitat in the CCE, while generally improving evaluation metrics and minimizing biases associated with any single SDM. Researchers are frequently updating and improving SDMs (e.g. new blue whale models have been published by Becker et al., 2018, Abrahms et al., 2019, and Palacios et al., 2019 since we undertook our example analysis). Consequently, we do not intend our results to be considered the current best set of predictions for blue whales in the CCE. Instead, we present esdm as a tool for creating and evaluating ensembles of SDM predictions for any species in a timely, straightforward and robust manner. This tool can allow managers and practitioners to avoid potentially ambiguous choices between models, and instead make more informed, science‐based decisions.

The example analysis demonstrates the utility of esdm and provides a framework and guidelines for esdm users. These guidelines are important because ensemble predictions are not inherently better than the original predictions; ensemble quality is dependent on sensible inputs and informed user choices when creating the ensemble (Araújo & New, 2007). For example, ensembles can minimize the biases of individual SDMs by averaging predictions across SDMs. However, creating an ensemble of predictions with similar biases will result in a biased ensemble, and thus an ensemble should incorporate predictions from SDMs that rely on different methods and data sources. In addition, esdm provides several ensemble methods because there is no consensus best method (Araújo & New, 2007; see Dormann et al., 2018 for an in‐depth discussion of weighting schemes). An unweighted average is useful when determining reasonable weights is impractical, such as in a data‐poor region. A weighted average allows users who have a priori knowledge of biases, e.g. through evaluation metrics or expert knowledge, to specify the contribution of each set of predictions to the ensemble.

When used properly, ensembles reduce implicit uncertainty (e.g. model type or data source) by averaging predictions made using different model types or data sources (Jones‐Farrand et al., 2011). However, esdm also offers several ways to incorporate explicit uncertainty (e.g. the standard error of model predictions) when creating an ensemble. For instance, ensemble weights based on original prediction uncertainty reduce the contribution of predictions with high uncertainty to an ensemble. However, this feature should only be used with comparable uncertainty values; if a model underestimates uncertainty, then its predictions will contribute disproportionately to an ensemble. In addition, esdm users can estimate among‐model uncertainty to identify areas of spatial agreement and disagreement between the predictions, which can indicate regions of an ensemble with higher or lower levels of precision.

Conservation and management decisions often have short timelines, making it difficult to conduct new studies. esdm allows decision‐makers to quickly create an ensemble of SDM predictions using simple methods (Gregr, Palacios, Thompson, & Chan, 2019; Ward, Holmes, Thorson, & Collen, 2014). To create a meaningful ensemble, users must choose sensible original predictions and an appropriate ensemble method. For less obvious decisions, such as choosing a base geometry or deciding between AUC‐based and TSS‐based weights, esdm provides a user‐friendly tool for examining the sensitivity of an ensemble to user decisions. While it is important that all choices be realistic and ecologically sound, these sensitivity analyses enable users to better understand the underlying uncertainties in species distribution patterns and allow for informed decision‐making.

ACKNOWLEDGEMENTS

This work arose from the workshop ‘Towards Ensemble Averaging of Cetacean Distribution Models’ organized by the National Marine Fisheries Service, with support from the International Whaling Commission, in San Diego on 21 May 2015. The authors acknowledge the workshop sponsors and attendees. Funding for this project was provided by the NOAA Fisheries Office of Science and Technology as part of the National Protected Species Toolbox initiative. We thank the spatial toolbox steering group for their feedback on the tool. This manuscript was improved by the insightful reviews of Matthieu Authier, Paul Fiedler, the Associate Editor and two anonymous reviewers.

AUTHORS’ CONTRIBUTIONS

K.A.F., E.A.B., M.L.D., E.L.H., D.M.P. and J.V.R. conceived the project. S.M.W. wrote the esdm r package and led the writing of the manuscript with help from K.A.F. and J.V.R. E.A.B., E.L.H., J.V.R. and D.M.P. provided data used in the example analysis. All co‐authors provided feedback on both the esdm package and manuscript and gave approval for publication.

DATA AVAILABILITY STATEMENT

Instructions for installing esdm and accessing the GUI, along with code for creating applicable figures, are at https://github.com/smwoodman/eSDM. esdm (https://doi.org/10.5281/zenodo.3371754) contains the example analysis data (https://doi.org/10.5281/zenodo.3365744) and a vignette performing the example analysis in r.
