Volume 14, Issue 3 p. 757-770
PERSPECTIVE
Open Access

Regional ecological forecasting across scales: A manifesto for a biodiversity hotspot

Jasper A. Slingsby

Department of Biological Sciences and Centre for Statistics in Ecology, Environment and Conservation, University of Cape Town, Cape Town, South Africa

Fynbos Node, South African Environmental Observation Network, Centre for Biodiversity Conservation, Cape Town, South Africa

Correspondence

Jasper A. Slingsby

Email: [email protected]

Adam M. Wilson

Department of Geography, Department of Environment and Sustainability, University at Buffalo, Buffalo, New York, USA

Brian Maitner

Department of Geography, Department of Environment and Sustainability, University at Buffalo, Buffalo, New York, USA

Glenn R. Moncrieff

Fynbos Node, South African Environmental Observation Network, Centre for Biodiversity Conservation, Cape Town, South Africa

Centre for Statistics in Ecology, Environment and Conservation, Department of Statistical Sciences, University of Cape Town, Cape Town, South Africa

First published: 03 January 2023
Handling Editor: Sydne Record

Abstract

  1. Iterative near-term ecological forecasting has great promise to provide vital information to decision-makers while improving our ecological understanding, yet several logistical and fundamental challenges remain. The ecoinformatics requirements are onerous to develop and maintain, posing a barrier to entry for regions where funding and expertise are limited, and there are fundamental challenges to developing forecasts that fulfil information needs spanning spatial, temporal and biological scales.
  2. Using the hyperdiverse Cape Floristic Region of South Africa as a case study, we propose that developing regionally focussed sets of ecological forecasts will help resolve logistical challenges faced by under-resourced regions of the world, while comparison or coupling of models across scales will facilitate new fundamental insights. We review information needs and existing models for the region and explore how they could be developed into a set of linked iterative near-term forecasts.
  3. Comparing or coupling ecological forecasts from different scales within the same domain has much potential to provide new insights for decision-makers and ecologists alike. They allow us to quantitatively link processes in space and time, potentially revealing feedbacks, interconnections and emergent properties, while providing powerful tools for testing decision scenarios and identifying trade-offs or unanticipated outcomes. While the development of multiple or combined ecological forecasts that span scales is not trivial, there are logistical gains to be made from developing shared ecoinformatics pipelines that feed multiple models. Even where useful forecasts do not yet exist, the pipelines can be of great value in their own right, delivering frequent and up-to-date information to decision-makers while providing the basis for forecast development and other scientific research.
  4. Viewed together, regionally focussed approaches to ecological forecasting present a compelling opportunity to overcome logistical constraints and to integrate across multiple scales of organisation, ultimately improving our understanding and management of ecosystems.

Isishwankathelo

  1. Uqikelelo oluphindaphindayo lwexesha elikufuphi ngendalo lunesithembiso esikhulu sokubonelela ngolwazi olubalulekileyo kubathathi-zigqibo ngelixa liphucula ukuqonda kwethu indalo esingqongileyo, ukanti imingeni emininzi yolungiselelo kunye nesisiseko isekho. Inzululwazi ye-ecoinformatics ineemfuneko ezinzima ukuphuhlisa nokugcina, ezi mfuneko zibeka umqobo ekungeneni kwimimandla apho inkxaso-mali kunye nobuchule bunqongopheleyo, kwaye kukho imingeni engundoqo ekuphuhliseni uqikelelo oluzalisekisa iimfuno zolwazi oluquka izikali zendawo, ezexesha kunye nezendalo.
  2. Sisebenzisa Ummandla waseKapa eMzantsi Afrika kwihlabathi liphela unentlobo ngeentlobo ezininzi zezinto eziphilayo njengomzekelo wophononongo, sicebisa ukuba ukuphuhlisa iiseti zengqikelelo zendalo ezigxile kwingingqi kuya kunceda ukusombulula imiceli mngeni yolungiselelo olujongene nemimandla yelizwe engenazibonelelo, ngelixa uthelekiso okanye ukudityaniswa kwemifuziselo kwizikali ezahlukeneyo iya kuququzelisa ukuqonda okutsha okungundoqo. Sihlola iimfuno zolwazi kunye nemifuziselo ekhoyo kule ngingqi yaseKapa kwaye siphonononga ukuba zingaphuhliswa njani zibe luqikelelo oludityanisiweyo lwexesha elikufuphi.
  3. Ukuthelekisa okanye ukudibanisa iingqikelelo zendalo esingqongileyo ezivela kwizikali ezahlukeneyo ngaphakathi kwendawo enye kunesakhono esikhulu sokunika ulwazi olutsha kubathathi-zigqibo kunye nee-ngcali zendalo esingqongileyo ngokufanayo. Zisivumela ukuba sidibanise ngokwezibalo iinkqubo kwisithuba kunye nexesha, ezinokutyhila iimpendulo, ukudibanisa, kunye nezinto ezintsha ezivelayo, ngelixa zisinika izixhobo ezinamandla zokuvavanya iimeko yezigqibo kunye nokuchonga kokubuyisana okanye iziphumo ebezingalindelekanga. Ngelixa uphuhliso loqikelelo lwendalo oluphindaphindeneyo okanye oludityanisiweyo lwezikali ezininzi lungabalulekanga kangako, kukho iinzuzo zolungiselelo ekufuneka zenziwe ngokuphuhlisa imifuziselo kwiNzululwazi ye-ecoinformatics ekwabelwanayo ngayo eyongeza kwi imifuziselo emininzi. Kwanalapho uqikelelo oluluncedo lungekabikho, olu lwazi lunokuba nexabiso elikhulu ngokwalo, ngokuzisa rhoqo ulwazi oluhlaziyiweyo kubenzi bezigqibo ngelixa libonelela ngesiseko sophuhliso lwengqikelelo kunye nolunye uphando lwezenzululwazi.
  4. Xa zijongwa kunye, iindlela ezijolise kwingingqi kuqikelelo lwendalo zinika ithuba elinomtsalane lokoyisa imiqobo yolungiselelo kunye nokudibanisa zonke izikali zezinto eziphilayo, ekugqibeleni siphucula ukuqonda kwethu kunye nolawulo lwendalo esingqongileyo.

1 INTRODUCTION

To mitigate and adapt to global change, planners and managers need information on the current and potential future states of biodiversity and dominant change drivers, including the confidence they can have in that information. We are altering our environment at an unprecedented rate, with dire consequences for nature's contribution to people (Díaz et al., 2018, 2019), yet even our basic understanding of ecology has limitations. Developing iterative near-term ecological forecasts has great potential to provide vital information to decision-makers while also improving our ecological understanding (Clark et al., 2001; Dietze, 2017; Dietze et al., 2018). Near-term ecological forecasting involves specifying models representing our current knowledge to make quantitative predictions of variables of interest over daily to interannual timescales, including fully specified uncertainties (Clark et al., 2001). Ideally, this is done iteratively, collecting and fusing new observations with the models, thus formalising the learning feedback loop by sequentially testing and improving our model, forecasts and ecological understanding (Dietze et al., 2018). While the information content of forecasts may initially be low (i.e. high uncertainty), this should improve with each iteration. Just developing the ecoinformatics pipeline required to collect and feed new observations to the model is a worthy pursuit in itself, because it can facilitate rapid delivery of relevant data to researchers and decision-makers (MacFadyen et al., 2022). Better still would be if the pipeline and potential forecasts delivered on multiple information needs for a region of interest. A multi-pronged, regionally focussed approach to ecological forecasting, with explicit comparison or coupling of models across scales, would create new learning opportunities, while overcoming logistical constraints and providing novel information to inform decision-making.

While the ecological forecasting paradigm has gained much traction and shows great potential for facilitating rapid advances (Dietze & Lynch, 2019; https://ecoforecast.org/), several challenges remain. Some are logistical, such as the difficulty and expense of collecting sufficient new data and making it available at low enough latency to allow for a useful forecasting time-step or forecast horizon, or developing and maintaining the ecoinformatics pipeline required to build and update models (Fer et al., 2021; MacFadyen et al., 2022). These challenges can be a barrier to developing ecological forecasts in under-resourced environments, as is evident from the global membership of the Ecological Forecasting Initiative (https://efi-members.herokuapp.com/). Other challenges are more fundamental, such as spanning spatial and temporal scales or levels of biological organisation (individual, population, community, landscape, etc.), and the potential for contrasting forecasts from models specified at different scales (e.g. the ecological fallacy; Robinson, 1950). Here we explore the potential utility of developing and linking models from different scales and disciplines within a focal region to help overcome logistical constraints, build interdisciplinary collaboration and to improve our ecological learning. We do this using the hyperdiverse Cape Floristic Region (CFR) of South Africa as a case study, where more than four decades of close engagement among scientists and managers have clearly defined many information needs (Allsopp et al., 2019; Gelderblom & Wood, 2019) and there is a rich history of ecological modelling, but few existing ecological forecasts. 
As such, this contribution also serves as a manifesto to encourage and guide the development of an integrated regional ecological forecasting system (Figure 1) that can support a set of interlinked near-term iterative ecological forecasts to allow us to gain a much deeper understanding of the exceptional biodiversity and ecology of this globally important region.

FIGURE 1
A conceptual overview of an integrated regional ecological forecasting system. Threats to biodiversity and nature's contributions to people and the information needs of decision-makers are identified. Monitoring and observation efforts collect data and information on threats, drivers and state variables (arrows with solid lines) and feed them to forecasts via an ecoinformatics pipeline that integrates and processes them. Forecasts and relevant raw data are made available for decision support via dashboards or similar that can be interrogated by managers and policymakers. Decision-makers identify their information needs concerning key threats. They also help define the scenarios or other inputs required by the forecast, and the way in which information is shared with them (e.g. a dashboard). Double-headed arrows indicate feedback to improve the collection, curation, analysis or presentation of data and information. The impacts of policy or management decisions are represented by the dot-dash line. Stippled lines highlight interlinkages among ecological forecasting models. Note that the threats, monitoring tools and forecasts are merely illustrative and not exhaustive.

The CFR is a Global Biodiversity Hotspot (Myers et al., 2000). Sadly, it is also a Global Extinction Hotspot (Humphreys et al., 2019), harbouring ~13% of all threatened plant species on the IUCN Red List of Threatened Species (https://www.iucnredlist.org/) due to habitat loss and fragmentation (Moncrieff, 2021; Ntshanga et al., 2021; Skowno et al., 2021), invasive species (Van Wilgen & Wilson, 2018), altered fire regimes (Slingsby, Moncrieff, Rogers, et al., 2020) and climate change (Slingsby et al., 2017), among others. These threats tie to several interlinked national goals and legislation for maintaining biodiversity and nature's contributions to people (summarised in Table S1), each with its own information needs, namely reducing habitat loss or degradation in areas of high biodiversity value (requiring landscape level monitoring and forecasting of ecosystem health or vegetation state); managing fire in a manner that maintains species and ecosystems while minimising risk of damage to property (requiring forecasts of the ecological and economic trade-offs of different fire management options); sustainably maximising water yield and carbon storage while minimising fire risk (requiring landscape-level forecasts of biomass and water yield); allowing sustainable harvesting of wild plant populations (requiring forecasts of the consequences of different harvesting regimes on populations over and above other global change threats); and controlling alien woody plant invasions so as to reduce their impact on streamflow, wildfire and biodiversity (requiring forecasts of the rate of growth and spread of invasive species populations). Most of these information needs can be directly tied to one or more existing ecological modelling pursuits and have the potential to be developed into ecological forecasts.

In the next sections, we introduce the existing models in the context of the known policy and management information needs listed above. We then explore their feasibility as ecological forecasting endeavours and the potential to build them and other models into an integrated regional ecological forecasting system (Figure 1). Key to this endeavour is the development of a shared and efficient ecoinformatics pipeline that can ingest monitoring data and either feed them directly to decision-makers in their raw form or fuse them with the models to iteratively update forecasts. Developing this pipeline as a shared community resource should have several advantages, such as reduced or shared costs, closer interaction across disciplines, identification of shared data (or forecast) needs and gaps, and easier development of new models and comparisons or interlinkages between them.

2 INFORMATION NEEDS AND EXISTING MODELS

2.1 Landscape-level monitoring and change detection

Perhaps the most straightforward information requirement for policymakers and managers is knowledge of the remaining extent, condition and rate of loss of natural ecosystems. This has become a common requirement globally, being used to assess the threat of ecosystem collapse through the IUCN Red List of Ecosystems (Keith et al., 2015) and Goal A of the emerging Global Biodiversity Framework of the Convention on Biological Diversity. In South Africa, legislation has required the use of ecosystem threat status as a tool to guide conservation and land use decision-making since 2004 (Botts et al., 2020). Remote sensing has revolutionised our ability to map and track land cover change with relatively standardised classification algorithms (Wulder et al., 2018), and has been used to develop at least one near-real-time change detection system for some ecosystems in the CFR (Moncrieff, 2022). However, existing remote sensing analyses in the CFR typically only provide information on complete habitat loss, and rarely tell us anything about the relative condition of the habitat that remains, hindering accurate ecosystem threat assessments (Skowno et al., 2021).

The CFR is predominantly an open ecosystem, with few trees, dominated by fire-dependent shrublands (Fynbos) that show high dynamism in space and time due to steep environmental gradients, seasonality, natural disturbances such as fire and drought, and long-term post-disturbance recovery trajectories (Slingsby et al., 2014). Detecting any signal of change over and above this natural variability is highly challenging (Slingsby et al., 2017; Slingsby, Moncrieff, & Wilson, 2020), and while few existing satellite change detection and time-series monitoring algorithms (e.g. Woodcock et al., 2020 and references therein) have been tested in the region, they are unlikely to be able to account for this extreme variability. Fortunately, Wilson et al. (2015) developed a hierarchical Bayesian model that accounts for the dynamism, describing the observed vegetation activity (Normalised Difference Vegetation Index [NDVI]) as a function of time since fire and time of year. The model fits a negative exponential function to describe the trajectory of NDVI recovery after fire, and includes a sine term to capture the seasonality. It is hierarchical, because the parameters of the function are regressed against a set of environmental covariates describing topography, soils and climate. The advantage of this regression step is that it allows the postfire recovery trajectories to be predicted for any location with known environmental conditions, or for future projected conditions.
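The functional form described above can be sketched in a few lines. This is a minimal illustration only: the parameter names (`alpha`, `gamma`, `lam`, `amp`, `phase`) and the simple linear regression step are assumptions made for the example and are not necessarily the parameterisation used by Wilson et al. (2015).

```python
import math

def expected_ndvi(age_years, alpha, gamma, lam, amp, phase):
    """Expected NDVI at a given time since fire (in years).

    alpha      : NDVI immediately after fire (assumed name)
    gamma      : NDVI gained by full recovery (assumed name)
    lam        : recovery time constant in years (assumed name)
    amp, phase : amplitude and phase of the seasonal sine term
    """
    # Negative exponential recovery towards alpha + gamma
    recovery = alpha + gamma * (1.0 - math.exp(-age_years / lam))
    # Annual cycle, one full period per year
    season = amp * math.sin(2.0 * math.pi * age_years + phase)
    return recovery + season

def site_parameter(intercept, slopes, covariates):
    """Hierarchical step (sketch): a curve parameter as a linear
    function of environmental covariates for one site."""
    return intercept + sum(b * x for b, x in zip(slopes, covariates))
```

In a full hierarchical model the parameters would be estimated jointly across sites with their uncertainties, and strictly positive parameters such as the recovery time constant would normally be modelled on a transformed (e.g. log) scale.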

Slingsby, Moncrieff, Rogers, et al. (2020) used Wilson et al.'s (2015) model as the basis of a proof of concept for near-real time change detection in the CFR. Any deviation of observed NDVI from the expectation under natural conditions, represented by the distribution of posterior probabilities of the mean predicted by the model, is flagged as a change in the ecosystem's condition. Exceedance above the maximum NDVI typically occurs when alien woody plants invade (usually from the genera Pinus, Eucalyptus or Acacia). Deviation below the minima can result from fire, vegetation clearing or plant mortality due to pathogens or drought. While the model is useful in its current form, and can be automatically rerun to update outputs, it is not specified as an iterative near-term ecological forecast (sensu Dietze et al., 2018), because it does not iteratively fuse new observations with the model, losing the benefits of a formalised learning feedback loop. Similarly, Ma et al. (2022) tested a set of deep learning methods on the Slingsby et al. dataset with good results forecasting NDVI for up to a year using a convolutional long short-term memory (ConvLSTM) model. While this method is useful for forecasting the expected vegetation signal, and can learn from new observations, it does not easily facilitate ecological learning, because the model has low explainability. This may soon change as explainable AI is an active and rapidly developing research field (Samek et al., 2019).
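The flagging logic described above can be illustrated with a short sketch. It assumes that a set of posterior draws of the expected NDVI is already available for a pixel and date; the 95% interval and the function names are illustrative choices, not those of the published system.

```python
def predictive_interval(draws, lower=0.025, upper=0.975):
    """Empirical interval from posterior draws of expected NDVI."""
    s = sorted(draws)
    n = len(s)
    return s[int(lower * (n - 1))], s[int(upper * (n - 1))]

def flag_change(observed, draws):
    """Classify an observation relative to the model expectation."""
    lo, hi = predictive_interval(draws)
    if observed > hi:
        return "above"   # e.g. possible alien woody plant invasion
    if observed < lo:
        return "below"   # e.g. fire, clearing, drought or pathogens
    return "within"      # consistent with natural variability
```

A flag is only a prompt for further investigation; in practice one would probably require several consecutive exceedances before raising an alert, to reduce false positives from noisy observations.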

The post-fire recovery model of Wilson et al. (2015) would be better suited for ecological forecasting and iterative improvement and/or linkage with other models if it were reframed as a state-space model (see comprehensive guide by Auger-Méthé et al., 2021). State-space models are applied to dynamic time-series problems that involve unobserved (latent) variables or parameters describing the evolution in the state of the underlying system. Latent states are connected to observations with an observation model that describes their relationship and accounts for bias and error. State-space models allow you to estimate parameters of the deterministic process, observation error and process error from the time series. By specifying the post-fire recovery model as a state-space model, it could be adapted to describe the trajectory of multiple ecosystem properties. For example, vegetation activity or health could be the latent (unobserved) state, informed by NDVI or other measurements, each with their own observation model to account for their relationship to the state of interest and their observation (or measurement) error (Figure 2). In this formulation, it is not necessary to precompute vegetation age as a covariate that regulates the post-fire recovery of vegetation activity. By specifying that vegetation activity follows a Gompertz or Logistic growth curve, the latent state at time t would depend only on the state at t − 1 and parameters that describe the shape of the curve. These parameters can be modelled as a function of environmental conditions as in the existing formulation of Wilson et al. (2015). Fire reduces vegetation activity by a fixed or variable amount, and at each time-step there is a probability of fire occurrence, which can be modelled by including another latent variable with its own process and observation model (e.g. Preisler and Benoit, 2004) or from an external estimate of burn probability (see Section 2.3). 
To explore this, we have developed a preliminary state-space model in Stan (Carpenter et al., 2017) and run it for a small sample of 40 sites (Figure 2; details in Supplementary Material; code and demonstration data available at https://doi.org/10.5281/zenodo.7271331). This state-space framework allows assimilation of new observations and updating of forecasts based on filtering algorithms such as the Kalman filter or sequential Monte Carlo, and would likely result in more accurate forecasts of future vegetation state.
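To make the idea of sequential assimilation concrete, the following is a minimal sketch of a bootstrap particle filter (one form of sequential Monte Carlo) for a logistic post-fire recovery process. It is not the Stan model referred to above: the parameter values, the fixed proportional fire loss, the known fire indicator and the Gaussian observation model are all assumptions made for illustration.

```python
import math
import random

def step(x, r, k, fire, fire_loss=0.8):
    """One deterministic time-step of the latent state:
    optional fire loss followed by logistic growth."""
    if fire:
        x = x * (1.0 - fire_loss)
    return x + r * x * (1.0 - x / k)

def particle_filter(obs, fires, r=0.3, k=0.7, sd_proc=0.02, sd_obs=0.05,
                    n_particles=500, seed=1):
    """Filtered mean of the latent state at each time-step.

    obs   : observations (e.g. NDVI); None marks a missing value,
            so trailing None entries produce a forecast
    fires : booleans, True where a fire occurs in that time-step
    """
    rng = random.Random(seed)
    particles = [rng.uniform(0.05, k) for _ in range(n_particles)]
    means = []
    for y, fire in zip(obs, fires):
        # Predict: push every particle through the process model plus noise
        particles = [max(1e-6, step(x, r, k, fire) + rng.gauss(0.0, sd_proc))
                     for x in particles]
        if y is not None:
            # Update: weight by the Gaussian observation likelihood, resample
            weights = [math.exp(-0.5 * ((y - x) / sd_obs) ** 2)
                       for x in particles]
            if sum(weights) > 0:
                particles = rng.choices(particles, weights=weights,
                                        k=n_particles)
        means.append(sum(particles) / n_particles)
    return means
```

Appending `None` values to `obs` (with a matching scenario in `fires`) turns the same loop into a forecast, because particles are then propagated without being reweighted. A fuller implementation would also estimate the growth parameters and report the spread of the particles as forecast uncertainty, not just their mean.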

FIGURE 2
A generic framework for a state-space model of postfire vegetation recovery (a). A logistic or Gompertz process model captures the behaviour of the initial rapid post-fire vegetation growth and its eventual saturation. The parameter vector W is modelled as a function of environmental variables, borrowing across pixels/sites as in Wilson et al. (2015). The probability of fire at each time-step Zt could also be modelled as a latent state, or input from an external fire behaviour model. An example time series (b) shows the median estimated state (black line) with 90% CI (dark shaded area) and observations (red points). Here fire is specified as a binary variable without an observation model. Forecast future states are shown in the light grey shaded area. Dashed vertical lines indicate the occurrence of fire, and in the case of the forecast period the simulated occurrence of a fire. Large drops in NDVI are predicted when fire occurs.

2.2 Ecosystem function

Many of nature's contributions to people in the CFR relate to above-ground vegetation biomass (e.g. carbon storage, fire risk) and stream flow (e.g. water provisioning, flood risk), which are interrelated through processes such as rainfall interception, evapotranspiration, flood attenuation and infiltration. While many of these are state variables that can be modelled in unison and derived using Earth System Models (Fisher et al., 2018), it would be exceedingly difficult to modify and run them at a spatial and temporal scale that is useful for decision-makers in the region. The Dynamic Global Vegetation Models (DGVMs) that are typically used to drive the vegetation component of Earth System Models further constrain their utility in the CFR and neighbouring biomes, because they struggle to characterise important aspects of the dominant plant functional types (e.g. shrubs, succulence, CAM-photosynthesis) and key processes like crown fires (Moncrieff et al., 2015). Even though the DGVMs are typically considered to be ‘mechanistic’ models, they are tuned to ensure they adequately capture properties like the overall distributions of biomes or carbon storage (Fisher et al., 2018; Moncrieff et al., 2015). DGVM predictions for Mediterranean-Type Ecosystems (MTEs) like the CFR, parts of Chile, California, Australia and the Mediterranean are generally poor, because these ecosystems do not represent a large part of the global carbon cycle or land surface and thus are not prioritised for model calibration. Narrowing down in scale, commonly used landscape-level models like LANDIS (Scheller & Mladenoff, 2004) and its derivatives are not always appropriate either, because they are designed and optimised for forest landscapes, while the CFR and much of the rest of the MTEs are open ecosystems with few trees. 
Other approaches for modelling multiple land surface states and fluxes in unison and at high resolution, like the Land Information System software framework (Kumar et al., 2006), have yet to be tested in the CFR. A positive spin-off of these challenges is that the CFR may be one of the best places to work on improving these models due to a plethora of available field data, an active ecological community and engaged practitioners. Focusing efforts on the less-well-studied parts of the world, and in particular places where global models do not perform well, is a key step in identifying important regional variation. Models that capture important mechanisms in the CFR should improve our ability to understand similar ecosystems like the MTEs and other shrublands in general, and eventually translate into improved DGVMs and land surface models. The development of a shared and open ecoinformatics pipeline (Figures 1 and 3) would greatly facilitate the testing and improvement of existing approaches and development of new models for the CFR and similar ecosystems.

FIGURE 3
An automated ecoinformatics pipeline to download, standardise, process and serve data for the EMMA project. A user-specific domain is provided to a set of download functions, which retrieve data from various sources (e.g. MODIS NDVI (Didan, 2015) and burned area (Giglio et al., 2015), CHELSA climate data (Karger et al., 2016), soil layers (Cramer et al., 2019), cloud data (Wilson & Jetz, 2016) and others). These layers are processed to standardise their projection and resolution and then used to generate desired data products in the formats required for modelling. They are automatically updated weekly, but the time-step could be adjusted, re-running portions of the workflow where needed and serving both the open-source code and data via Github releases (https://github.com/AdamWilsonLab/emma_envdata/releases). The workflow, which runs in a docker image, can be run locally, on a dedicated server, or even via Github actions and can be adapted to other ecosystems or regions with few changes. All code is available at https://doi.org/10.5281/zenodo.7272296.

When considering state variables independently, water provision is perhaps the most important ecosystem service for the local economy, and supply for agriculture, industry and residential use comes primarily from rain-fed reservoirs. Unfortunately, these have become less reliable with population growth, increasingly erratic rainfall (Pascale et al., 2020) and excessive water use by alien tree infestations in the mountain catchments (Le Maitre et al., 2019; Moncrieff et al., 2021). Increased interception and evapotranspiration by alien trees are estimated to reduce Western Cape Water Supply yield by >6% each year and will likely worsen to >20% without effective control (Le Maitre et al., 2019). Thus, water supply represents a critical connection between ecosystem change and human needs in the region, and forecasts are key for assisting decision-makers in weighing up the merits and/or quantifying the costs and benefits of invasive alien plant control operations. Approaches for quantifying the hydrological consequences of alien plant invasions in the region vary in complexity and scale, usually with a trade-off between the two. Fully specified hydrological models incorporate detailed hydrological processes, but are data-hungry and typically only operate at the catchment scale (e.g. MIKE-SHE; Rebelo et al., 2022). Quantifying impacts across the whole CFR requires landscape hydrology approaches that can operate over many catchments and handle sparse datasets, but these are less realistic in the processes they include (e.g. Le Maitre et al., 2019). Recently, Moncrieff et al. (2021) recast the approach of Le Maitre et al. (2019) in a Bayesian framework, allowing propagation and analysis of uncertainty in the estimates. Unfortunately, the method is aimed at estimating the anticipated impacts in an average climatological year across a plausible set of post-fire vegetation ages and is currently not well suited for forecasting.

Beyond testing and refining existing models developed elsewhere for application in the CFR, there are several opportunities in, and potential benefits from, improving our understanding of the relationship between ecosystem status (Section 2.1) and the hydrological and carbon cycles (this section), which would allow us to integrate across drivers and produce probabilistic forecasts for multiple variables. For example, the state-space model outlined in Figure 2 is based on a reformulation of the hierarchical Bayesian model of Wilson et al. (2015). It uses satellite-derived observations of NDVI to inform the vegetation state and estimate parameters. Numerous other ecosystem states and properties in the CFR and other fire-prone ecosystems follow a similar post-fire recovery trajectory and thus could be modelled individually using a similar process model, but a different observation model. Above-ground biomass or leaf area index (LAI) are obvious examples that require only minor modification of the model. One could even integrate multiple observations collected at different frequencies and with different sources of error (e.g. LAI from ceptometers, field-measured biomass, satellite-derived vegetation indices or canopy height and cover fraction models from light detection and ranging). Key to their integration is the development of observation models that link diverse measurements to the latent vegetation state (e.g. Wilson et al., 2011). The state-space framework then allows all of these observations to inform the estimation of a common latent vegetation state. The benefit would be that observations collected frequently and extensively, though only loosely related to the vegetation state (e.g. NDVI), could be combined with observations that are difficult to obtain yet are more direct measurements of the latent state (e.g. field-measured biomass). While one could use the forecasts from one model (e.g. for LAI) as inputs for another model (e.g. evapotranspiration), perhaps more compelling would be to integrate multiple different variables of interest into a single model by specifying multiple latent variables with interrelationships between them, each dependent on different (or overlapping) observations with their own observation models (Auger-Méthé et al., 2021; McClintock et al., 2017).
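The way in which different observation models let several data streams inform one latent state can be shown with a scalar, linear-Gaussian example. The sketch below assumes each measurement is a linear function of the latent state plus Gaussian error; the intercepts, slopes and variances in the usage lines are invented purely for illustration.

```python
def update(mean, var, y, intercept, slope, obs_var):
    """Gaussian (Kalman-style) update of a scalar latent state, given
    an observation y = intercept + slope * state + noise(obs_var)."""
    innovation = y - (intercept + slope * mean)
    gain = var * slope / (slope ** 2 * var + obs_var)
    return mean + gain * innovation, (1.0 - gain * slope) * var

# Prior belief about the latent vegetation state (arbitrary units)
mean, var = 1.0, 0.5

# Frequent but loosely related measurement (e.g. NDVI): large error variance
mean, var = update(mean, var, y=0.55, intercept=0.2, slope=0.3, obs_var=0.05)

# Rare but direct measurement (e.g. field biomass): small error variance
mean, var = update(mean, var, y=1.3, intercept=0.0, slope=1.0, obs_var=0.01)
```

Each update shrinks the variance of the latent state by an amount that depends on how informative the measurement is, so the direct measurement dominates when it is available while the frequent one keeps the estimate current between field visits. Non-linear or non-Gaussian observation models would need a different update (e.g. the particle weighting used in sequential Monte Carlo).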

2.3 Fire modelling

Wildfire can result in tremendous property damage (Kraaij et al., 2018; Moritz et al., 2014), but is also an essential driver of biodiversity and ecosystem function across large parts of the planet (Bowman et al., 2009; He et al., 2019). Current fire-management practices are typically focused on identifying and reducing fire hazard and risk, but rarely take into account the role of fire in maintaining biodiversity. From an ecological management perspective, there are two separate, but related, problems: knowing the probability of a fire occurring at a location in the next time-step versus maintaining and anticipating changes in the natural fire regime (frequency, timing, severity, intensity) required to support desired biodiversity and ecosystem function. In fire-dependent ecosystems like the CFR, most plant species and communities depend on fire for their long-term persistence (Kruger & Bigalke, 1984). Where fire is excluded, species that exhibit fire-dependent seed release or germination cannot complete their life cycles (Le Maitre & Midgley, 1992), and fire-sensitive species (e.g. forest) may invade and shift the vegetation towards low-diversity fire-free alternative ecosystem states (Manders & Richardson, 1992; Slingsby, Moncrieff, Rogers, et al., 2020). Conversely, where fire is too frequent or badly timed, species that are killed by fire and have not had time to mature and set seed may be driven locally extinct. Unfortunately, knowing and maintaining the parameters of the fire regime suitable for maintaining species and ecosystems is not always straightforward, because they vary between species and with location (Magadzire et al., 2019). Similarly, human activities have direct and indirect impacts on fire regimes that can be difficult to anticipate or detect (Slingsby, Moncrieff, Rogers, et al., 2020).

To date, fire in the CFR has been managed by ‘rule of thumb’, based on our understanding of the demography of serotinous shrub species in the family Proteaceae. Local management authorities apply the rules that ‘No fire should be permitted in fynbos until at least 50% of the population of the slowest-maturing species in an area have flowered for at least three successive seasons (or at least 90% of the individuals of the slowest maturing species in the area have flowered and produced seed). Similarly, a fire is probably not necessary unless a third or more of the plants of these slow-maturing species are senescent (i.e. dying or no longer producing flowers and seed)’ (CapeNature, n.d.). This is a useful starting point, and essentially sets the acceptable range of fire return intervals based on the needs of the flora present, but is of limited utility where no serotinous Proteaceae occur, and cannot be used to forecast future fire occurrence or fire regimes. It also makes major assumptions about the acceptable demographic rates required to sustain populations. Demographic models with fully specified uncertainties could offer a vast improvement (see Section 2.4).

Inferring current and forecasting future fire regimes is critical for accounting for the effects of fire on biodiversity in global change research and for spatial land use planning. The estimates of current and future fire return intervals most commonly used in modelling studies in the CFR (Magadzire et al., 2019; Merow et al., 2014; Treurnicht et al., 2020) are derived from Wilson et al.'s (2015) post-fire recovery model described above. Here the fire return interval was estimated from the rate of NDVI recovery post-fire (representing fuel accumulation), which was found to be a good predictor of observed fire regimes, and can be forecast under future climate scenarios. Another approach has been spatial interpolation from a survival model based on a multi-decadal database of observed fires (Moncrieff et al., 2021; Wilson et al., 2010). Unfortunately, these approaches do not account for changes in sources of ignition or the many factors that affect fire spread. Slingsby, Moncrieff, Rogers, et al. (2020) addressed this by proposing a mechanistic approach to explore the spatial probability of fire occurrence based on the notion of ignition catchments, that is, the spatial extent and temporal range within which an ignition is likely to result in a site burning. Changes in the factors that influence ignition catchments are then used to estimate change in the fire regime. They aggregated and compared repeated fire spread simulations from a mechanistic model with different land cover inputs to develop spatial estimates of the change in the probability of fire occurrence between 1750 and 2008. This revealed that expansion of urban areas had altered fire spread, creating anthropogenic fire shadows with little or no fire and negative consequences for biodiversity. This approach can be extended to include changes in fuel properties (e.g. invasive alien plants, moisture content, post-fire vegetation age) or climate, weather, ignitions and their relative timing.
While the method used produced static maps and did not adequately quantify uncertainty, the conceptual framework can be applied with other fire risk modelling methods that account for the spatial relationship among modelled locations, including state-space models as outlined in Figure 2 (e.g. Preisler & Benoit, 2004).
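
The aggregation step described above, comparing repeated fire spread simulations under two land cover scenarios, can be sketched as follows. This is an illustration only: the fire spread simulator itself is not shown, the function names are hypothetical, and each simulation is assumed to be a grid of 0/1 values marking which cells burned.

```python
from typing import List

BurnMap = List[List[int]]  # assumed format: 1 = cell burned in this simulation, 0 = unburned


def burn_probability(simulations: List[BurnMap]) -> List[List[float]]:
    """Fraction of simulations in which each grid cell burned."""
    n = len(simulations)
    rows, cols = len(simulations[0]), len(simulations[0][0])
    return [[sum(sim[r][c] for sim in simulations) / n for c in range(cols)]
            for r in range(rows)]


def probability_change(before: List[BurnMap], after: List[BurnMap]) -> List[List[float]]:
    """Per-cell change in burn probability between two land cover scenarios."""
    p_before, p_after = burn_probability(before), burn_probability(after)
    return [[b - a for a, b in zip(row_before, row_after)]
            for row_before, row_after in zip(p_before, p_after)]
```

Cells with strongly negative values correspond to the fire shadows described above. A point estimate like this carries no uncertainty, which is the limitation the text notes.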

Ultimately, managing fire in the CFR requires forecasts that can tell us about where and when we may expect to experience fires, how the fire regime may be changing, and how biodiversity and ecosystem function may respond. Furthermore, forecasts aimed at managing biodiversity or nature's contributions to people often require fire information (or fire forecasts) as inputs. This strongly argues for coupling or integration of models. For example, the forecasts of vegetation state in Figure 2 require an estimate of the probability that a fire will occur in a given timeframe (determined by the time difference between steps in a time-discrete state-space model) at a particular location. This could be extracted from any number of different fire models, including statistical models (e.g. Wilson et al., 2010) or mechanistic models (e.g. Slingsby, Moncrieff, Rogers, et al., 2020). A more sophisticated approach would be to then feed the updated vegetation state into the fire model, resulting in closer coupling between the fire and vegetation forecasts, or to develop an integrated model that can forecast both fire and vegetation state.
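
A minimal sketch of this coupling, under assumptions that are ours rather than part of any of the cited models: the fire model supplies an annual probability of fire, the hazard is constant within a time step, and vegetation state is reduced to post-fire age. All function names are hypothetical.

```python
import random
from typing import Callable, List


def step_fire_probability(annual_probability: float, step_years: float) -> float:
    """Probability of at least one fire within a time step, assuming a constant hazard."""
    return 1.0 - (1.0 - annual_probability) ** step_years


def simulate_vegetation_age(initial_age: float, n_steps: int, step_years: float,
                            annual_fire_probability: Callable[[float], float],
                            rng: random.Random) -> List[float]:
    """Step post-fire vegetation age forward, resetting it to zero when a fire occurs.

    annual_fire_probability maps the current vegetation age to an annual fire
    probability and stands in for the output of any fire model.
    """
    age = initial_age
    trajectory = []
    for _ in range(n_steps):
        p_fire = step_fire_probability(annual_fire_probability(age), step_years)
        age = 0.0 if rng.random() < p_fire else age + step_years
        trajectory.append(age)
    return trajectory
```

Passing a function that ignores vegetation age gives one-way coupling, in which fire drives vegetation. Passing one that increases with age, as a stand-in for fuel accumulation, feeds the updated vegetation state back into the fire model.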

2.4 Plant demography and distribution

Globally, there is much focus on the conservation and management of individual species (Jetz et al., 2019; Mace et al., 2008). The models described above focus on community to ecosystem-level processes and do not account for dynamics of populations or the distributions of species. Beyond assessing risk of species' extinction (Mace et al., 2008; Raimondo et al., 2009), and informing conservation priorities (Skowno et al., 2019), there are a number of policy and management decisions in the CFR that require information on individual species, both indigenous and alien. These include fire management, managing and monitoring wildflower harvesting (Treurnicht et al., 2021), managing invasive alien plants (Van Wilgen & Wilson, 2018), and monitoring and planning for the impacts of climate change (Schurr et al., 2012).

The Proteaceae are the best-studied and understood indigenous plant family in the CFR and are used as model organisms or indicator species for many conservation policy and management applications (Schurr et al., 2012). Extensive locality, demographic and life-history data have been, and continue to be, collected by conservation authorities (CapeNature and SANParks), citizen scientists (Protea Atlas Project and iNaturalist) and researchers, allowing the parameterization of a large suite of models to date. Our knowledge of the demography of Proteaceae has been used to create rules of thumb for the direct management of vegetation in the CFR in two ways. First, at the ecosystem level, to help determine acceptable fire return intervals (see Section 2.3 above). Second, at the species level, for setting guidelines for sustainable wild harvesting of their inflorescences. The wildflower harvesting rule is that: ‘[there should be no] harvesting until at least 50% of the population had commenced flowering, a harvest of up to 50% of current season flower heads after this stage, and no harvesting at least one year prior to a prescribed burn’ (Van Wilgen et al., 2016). Both rules are based on the premise that maintaining seed banks is the key to the persistence of Proteaceae populations—that is, that there is a large enough seed bank present when a fire occurs for the population to recover. These rules have several important limitations. Observation and modelling studies have shown that they do not account for interspecific differences or intraspecific variation across the species' ranges (Merow et al., 2014; Treurnicht et al., 2021). The total number of seeds produced per individual, the viability and recruitment success of those seeds, individual survival and other factors vary between species and with environmental conditions. 
Interannual variation is also relevant, because conditions during seed production or the establishment phase vary widely from year to year (Slingsby et al., 2017), such as during the recent severe drought in the region (Pascale et al., 2020). Population models have also revealed flaws in the accepted wisdoms. For example, species that persist through fire events by resprouting are generally believed to be less reliant on seed banks, but a recent range-wide population viability analysis of 26 Proteaceae species revealed that resprouters can be just as vulnerable to wildflower harvesting (and thus seed loss) as species that are killed by fire and recruit from seed (Treurnicht et al., 2021).
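
For illustration, the wildflower harvesting rule quoted above can be written as a single function. The function and parameter names are hypothetical, and we read 'no harvesting at least one year prior to a prescribed burn' as prohibiting harvest when a burn is planned within the next year; other readings are possible.

```python
from typing import Optional


def allowed_harvest(frac_population_flowering: float,
                    current_season_flower_heads: int,
                    years_to_prescribed_burn: Optional[float] = None) -> int:
    """Maximum number of flower heads that may be harvested under the quoted rule."""
    if frac_population_flowering < 0.5:
        return 0  # fewer than 50% of the population has commenced flowering
    if years_to_prescribed_burn is not None and years_to_prescribed_burn <= 1.0:
        return 0  # a prescribed burn is planned within a year (our interpretation)
    return current_season_flower_heads // 2  # up to 50% of current season flower heads
```

The fixed thresholds make the limitations discussed above concrete: the same values apply to every species, site and year.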

For alien species, managers often require estimates of the distribution, age, density and identity of invasive species to help plan and monitor control operations (Van Wilgen & Wilson, 2018). Efforts to address this need in the CFR have been primarily limited to correlative species distribution models, with reasonable results predicting the broad-scale potential range of invaders (Van Wilgen & Wilson, 2018). Recently, there has been great interest and promise in the use of satellite remote sensing to map and monitor invasive species (Royimani et al., 2019). Existing efforts in the CFR are still too coarse to meet user needs (Holden et al., 2021), but the increasing availability of time series of high-resolution imagery is making it feasible to map medium to large trees and shrubs (Slingsby & Slingsby, 2019), and new data types like aerial and satellite imaging spectroscopy (Cavender-Bares et al., 2022; Cawse-Nicholson et al., 2021) should be very helpful in this regard. The NASA Biodiversity Survey of the Cape campaign (www.bioscape.io) that is currently underway in the Greater CFR should provide a good test of the utility of this technology for delivering on the region's biodiversity information needs.

To date, the demographic analyses described above have been limited to summaries of past observations or models projected for scenarios multiple decades into the future, and thus not designed for continuous monitoring or near-term forecasting. As with most aspects of the ecology of the CFR, plant population demography is closely linked to fire and post-fire recovery cycles and thus demographic predictions for most species require this additional information. Demographic models could be recast as iterative ecological forecasting models that draw from both ecosystem-level values informed by remote sensing or other model outputs (such as the fire risk or post-fire biomass accumulation) and individual-level observations of plant growth, survival and reproduction (Briscoe et al., 2019). While we do not have space to discuss their relative merits and challenges, there are plenty of examples of process-explicit population and species distribution models that could be suitable for developing forecasts at the level of detail required by managers in the CFR, and that allow for complex imputation of varied data from multiple sources and integration across submodels (Briscoe et al., 2019; Isaac et al., 2020; Jones et al., 2021; Lamonica et al., 2021).

3 DEVELOPING AN INTEGRATED REGIONAL ECOLOGICAL FORECASTING SYSTEM

3.1 Creating and feeding the data pipeline

Iteratively updated ecological forecasts depend upon continuously updated data products. Maintaining an ecoinformatics pipeline requires much effort, but can be simplified through automation, and better justified or supported if one open collaborative community cyberinfrastructure is developed and used to feed multiple models (Fer et al., 2021; Ramachandran et al., 2021). The EMMA project (www.emma.eco) provides one example (Figure 3), implemented using free and open-source software with the targets package for R (Landau, 2021; R Core Team, 2020) to automatically generate a standardised set of data products on a weekly basis using accessible, scalable and transparent tools. While the pipeline is currently geared towards running Wilson et al.'s (2015) post-fire recovery model as part of the change detection system proposed by Slingsby, Moncrieff, Rogers, et al. (2020), the data products could easily be fed into the state-space models described above, or many others. Similarly, the underlying targets framework is highly flexible, allowing the pipeline to be modified to ingest and process new data sources such as from other sensors or field surveys. Using open collaborative community cyberinfrastructure and shared data products among multiple models and working groups saves time and effort, and makes the model outputs more comparable, reducing inconsistencies introduced by the use of different model inputs (Fer et al., 2021). Ideally, where multiple analogous datasets exist, they should all be ingested and their effects on forecasts explored. Similarly, where the datasets have direct relevance for decision-makers, it is relatively simple to generate and deliver automated summaries and reports for different stakeholders. The existence of a shared ecoinformatics pipeline would also assist in identifying available datasets, or where key datasets needed to drive forecasts (e.g. climate, soils) are missing or require improvement.
For example, available global data products and forecasts vary in their spatial and temporal resolution (e.g. NASA Global Modelling and Assimilation Office (GMAO), NOAA Global Ensemble Forecast System (GEFS) and the European Centre for Medium-Range Weather Forecasts), and may be too coarse for local needs, or be based on little or no data from the focal region of interest (e.g. SoilGrids; Hengl et al., 2017). Another major challenge is that many data products lack information about their uncertainty, limiting our ability to fully quantify uncertainty in forecasts that use them as input variables. As ecological forecasting gains traction, there will be increasing need for developing methods and interdisciplinary collaboration to generate suitable input data products that are downscaled and/or optimised for forecasting in the region of interest (e.g. Cramer et al., 2019) and include fully quantified uncertainties (e.g. Wilson & Silander Jr, 2014).
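
The core idea behind this kind of automated pipeline, rebuilding a data product only when its inputs have changed, can be sketched in a few lines. This is not the targets API and not code from the EMMA pipeline; it is a hypothetical Python illustration in which every name is ours, and it assumes that targets are listed in dependency order and that their values can be serialised to JSON.

```python
import hashlib
import json
from typing import Callable, Dict, List, Tuple

Target = Tuple[List[str], Callable[..., object]]  # (names of inputs, build function)


def run_pipeline(raw_inputs: Dict[str, object],
                 targets: Dict[str, Target],
                 cache: Dict[str, dict]) -> List[str]:
    """Build each target in turn, skipping any whose inputs are unchanged.

    Returns the names of the targets that were rebuilt on this run.
    """
    values = dict(raw_inputs)
    rebuilt = []
    for name, (deps, build) in targets.items():
        inputs = [values[d] for d in deps]
        # Fingerprint the inputs so that a change upstream invalidates this target.
        fingerprint = hashlib.sha256(
            json.dumps(inputs, sort_keys=True).encode()).hexdigest()
        cached = cache.get(name)
        if cached is None or cached["fingerprint"] != fingerprint:
            cached = {"fingerprint": fingerprint, "value": build(*inputs)}
            cache[name] = cached
            rebuilt.append(name)
        values[name] = cached["value"]
    return rebuilt
```

With a persistent cache, a weekly run that receives new imagery but no new fire records would rebuild only the products downstream of the imagery, which is what keeps a shared pipeline feeding several models affordable to run.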

Of course, feeding the ecoinformatics pipeline will often require more than just downloading ready-made data products. While hard-earned field data are the gold standard for many ecological forecasts, they are often costly and time-consuming to collect, not to mention irregular in time and space and with their own errors and biases. To attain the spatial and temporal resolution required for many forecasts, we will need to increasingly rely on data from sensor networks (streamflow, weather, cameras, etc.), citizen science networks and remotely sensed imagery (Keitt & Abelson, 2021). These may require intermediate processing steps to clean data or extract variables of interest, such as population counts from aerial imagery (e.g. Slingsby & Slingsby, 2019) or vegetation fuel properties from remote sensing (e.g. Arroyo et al., 2008). Fortunately, much of this can be automated with machine or deep learning methods that are constantly improving in accuracy and increasingly accessible to non-technical users through open-source software (e.g. https://github.com/azavea/raster-vision; Yuan et al., 2020). Similarly, routine data transformations and quality checks can be automated using continuous integration/continuous deployment (CI/CD) tools like GitHub Actions (Kim et al., 2022). While some of these data products may currently not be good enough to drive forecasts on their own, advances in data integration allow them to be used to constrain models in time-steps between more comprehensive data inputs (e.g. field surveys), and they may be directly useful for decision-making or scenario development (Dietze, 2017; Isaac et al., 2020). As the forecasts are developed, they can be used to direct data collection efforts by scientists, citizen scientists and conservation management agencies to most efficiently target and reduce sources of uncertainty under given resource constraints.

3.2 Progressive development and integration of forecasts

Key to developing an integrated regional ecological forecasting system is the sustainability of the project. This requires ensuring buy-in from collaborators/developers, stakeholders and funders, which depends on demonstrating value early in the development of the project. Fortunately, this can be relatively straightforward if one incrementally co-develops the ecoinformatics pipeline with stakeholders and adopts open and reproducible research principles. Even without operational forecasts, the direct provision of up-to-date and easily digestible summaries of raw data or existing data products can be of great value to decision-makers, while analysis-ready datasets facilitate easy model development, scientific inquiry and standardised comparison of modelling outputs. One can then incrementally add forecast models, individually at first, building towards having multiple forecast models, where some may share input data or require outputs from other forecast models as inputs, and eventually developing more complex coupled or integrated models. Once we begin coupling or integrating models for the region, it will facilitate learning about feedbacks, emergent phenomena and trade-offs between the properties of interest to ecologists and decision-makers (Carriger et al., 2019; Chisholm, 2010; Higgins et al., 1997).

Developing multi-pronged, regionally focussed iterative near-term ecological forecasts, with explicit comparison or coupling of models across scales, will create new opportunities to learn about biodiversity and provide novel information to inform decision-making. Viewed together, the models described above present a compelling opportunity to integrate across multiple scales of organisation and improve our understanding and management of this ecosystem. Ultimately, the overarching goal is to include all the state variables in combined models so that we can capture covariances and explore emergent phenomena and feedbacks across scales. Given that the spatial–temporal forecasts we envision involve estimating multiple state parameters across the ~90,000 km² of the CFR (~150 million Landsat pixels), achieving this is likely to be limited by available methods and resources (e.g. computing power!). For example, the necessary data fusion is only likely to be achievable using Bayesian models, but traditional MCMC estimation would be very slow for models of this size and complexity. While we are seeing rapid advances in computational tools and efficient methods like Approximate Bayesian Computation (Beaumont, 2019), we would most likely have to start with simplified models and smaller datasets.
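
To give a sense of how Approximate Bayesian Computation sidesteps likelihood evaluation, the sketch below applies the simplest variant, rejection sampling, to a toy problem: estimating an annual fire probability at a single site from a count of fires over a number of years. The uniform prior, the independence of years and the function name are assumptions made for illustration; this is not a model proposed for the CFR.

```python
import random
from typing import List


def abc_rejection(observed_fires: int, n_years: int, n_draws: int,
                  tolerance: int, rng: random.Random) -> List[float]:
    """Approximate posterior sample of an annual fire probability by ABC rejection."""
    accepted = []
    for _ in range(n_draws):
        p = rng.random()  # draw a candidate from a Uniform(0, 1) prior
        # Simulate a fire record, treating each year as an independent trial.
        simulated_fires = sum(rng.random() < p for _ in range(n_years))
        # Keep the candidate only if the simulated count is close to the observed one.
        if abs(simulated_fires - observed_fires) <= tolerance:
            accepted.append(p)
    return accepted
```

The accepted draws approximate the posterior without a likelihood ever being written down, but most candidates are discarded. That cost grows quickly with the number of parameters, which is one reason to start with simplified models and smaller datasets.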

Perhaps the biggest challenge to developing and maintaining regionally focussed iterative near-term ecological forecasts is buy-in from stakeholders and the development of the requisite skills among those who would benefit from utilising or contributing to these initiatives. While co-developing the system with decision-makers may improve their buy-in (e.g. www.emma.eco/workshop), there will most likely need to be training provided for the use of the tools developed, or where resources are limited, the provision of devices (laptops, smartphones, etc.) to access the tools. Similarly, developing ecological forecasts and contributing to the ecoinformatics pipeline requires advanced statistical training and data science skills. Fortunately, undergraduate- and graduate-level quantitative biology and data science courses are becoming increasingly commonplace at universities, and instructors on these courses typically adopt open science principles and make their teaching materials freely available online (e.g. the many ecological forecasting materials listed at https://ecoforecast.org/resources/educational-resources/). That said, the sustainability of any ecological forecasting system will strongly depend on maintaining dedication to training, stakeholder engagement and inclusivity. For example, in the CFR, co-production of management-relevant ecological science relating to the information needs above goes back 100 years or more, and there is a long history of management agencies having embedded scientific staff, strong links with universities and other research institutions, and in-house training (Van Wilgen et al., 2016). Unfortunately, the former South African government was not inclusive of the diverse peoples of the country and not all voices and interests were heard. 
Ultimately, this was at least partly responsible for the loss of government support for ecological research programmes and in-house capacity at conservation management agencies when South Africa became a democracy (Slingsby et al., 2021; Van Wilgen et al., 2016). Fortunately, the CFR boasts a transdisciplinary learning network of scientists, decision-makers and other stakeholders who have met annually for more than four decades to discuss the co-production of knowledge underpinning conservation efforts in the region (The Fynbos Forum; Gelderblom & Wood, 2019). While not necessarily intentionally inclusive, the forum originated as an informal association that was less subject to government regulation and meetings have been multiracial since the 1970s, allowing greater representation of needs and ideas and smoothing the transition into a democratic South Africa. Newer initiatives like the Southern African Program on Ecosystem Change and Society are focused on promoting transdisciplinary and engaged research that is resilient or even responsive to social and governance changes (https://sapecs.org; Biggs et al., 2022).

4 CONCLUSIONS

Managing natural resources through environmental change is a challenging endeavour. The iterative ecological forecasting framework can be highly valuable to decision-makers by enabling near-term predictions about how the system will respond to change. Unfortunately, operational ecological forecasts typically require extensive expertise and resources and will be difficult to develop and support in resource-constrained regions of the world. Here we have proposed that developing an integrated ecological forecasting system for a focal region should help overcome these logistical constraints, using the CFR of South Africa as a case study.

There is great potential for iterative near-term ecological forecasts in the CFR, and many of the building blocks in terms of understanding stakeholder needs, provisional models, an ecological modelling community and data streams are reasonably well developed. Ecological forecasts spanning spatial, temporal and biological scales should help overcome logistical constraints, while creating opportunities to learn about key issues in this region and ecology and forecasting in general, such as scale dependence, feedbacks or emergent phenomena (Dietze, 2017). It would also provide a flexible and diverse toolbox for assessing the trade-offs between and relative merits of alternative decision options and how interconnections across scales may generate unanticipated or potentially undesirable outcomes. For example, in South Africa, ecosystem threat assessments and environmental impact assessments for authorising development projects typically only consider the biodiversity value of the portion of habitat that would be transformed (Botts et al., 2020; Skowno et al., 2021). There would be great value in being able to explore how land use and spatial planning decisions in one location may, for example, alter the probability of wildfire at others, and how this may affect species, ecosystems and nature's contributions to people across the landscape. Coupling forecasts with an economic model could further assist in justifying different management strategies (e.g. Higgins et al., 1997), or evaluating trade-offs between ecosystem services (e.g. Carriger et al., 2019; Chisholm, 2010).

Developing regionally focussed ecological forecasting efforts spanning spatial, temporal and biological scales is a pragmatic approach to dealing with logistical, technological and funding challenges, especially in under-resourced scientific environments. They would help unify existing efforts, build stronger communities and help bridge the science-management-policy divide, and aid in guiding the deployment of limited resources for maximum ecological benefit (e.g. through shared infrastructure, targeted data collection informed by models). They would also facilitate international collaboration by attracting the attention of the global modelling community. Open ecoinformatics pipelines that deliver analysis-ready data allow rapid model development and testing of the transferability of new or existing models and theory across regions (Dietze, 2017; Lewis et al., 2022). Ultimately, this should facilitate the development and application of the best possible models for each region. As the tools and skills for developing collaborative community cyberinfrastructure grow and mature, it should become easier and cheaper to transfer existing ecoinformatics pipelines, and modify and maintain them for new regions. Promoting ecological forecasting in under-resourced regions is critical, because they often harbour the richest biodiversity and face great management challenges, providing fertile ground for ecological forecasts to have impact. While the requisite skills and resources will often be limiting, the rapid growth of the global ecological forecasting community and the increasing availability of free open online resources should help fill this gap. The inclusion of the diverse ecologies and perspectives that these regions have to offer has great potential to make valuable contributions to global ecological and forecasting theory.

AUTHOR CONTRIBUTIONS

All authors conceived the ideas and designed the methodology; Jasper A. Slingsby led the writing of the manuscript; Glenn R. Moncrieff led the development of the state-space model; Brian Maitner and Adam M. Wilson led the development of the EMMA ecoinformatics pipeline. All authors contributed critically to the drafts and gave final approval for publication.

ACKNOWLEDGEMENTS

The authors thank Simcelile Chenge for the isiXhosa translation of the abstract, and Andrew Skowno, Yingjie Hu and two anonymous reviewers for feedback on earlier versions of the manuscript. J.A.S. and G.R.M. were supported by the National Research Foundation of South Africa (Grant Nos. 150296, 118593 and 142438). A.M.W. and B.M. were supported by NASA BioSCape (80NSSC21K0086) and NASA Ecological Forecasting Team Applied Sciences Program (80NSSC21K1183). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

CONFLICT OF INTEREST

The authors have no conflicts of interest to declare.

PEER REVIEW

The peer review history for this article is available at https://publons.com/publon/10.1111/2041-210X.14046.

DATA AVAILABILITY STATEMENT

Data and code used for fitting the state-space model are available on GitHub at https://doi.org/10.5281/zenodo.7271331. Code for the EMMA ecoinformatics pipeline is available on GitHub at https://doi.org/10.5281/zenodo.7272296.