State‐and‐transition simulation models: a framework for forecasting landscape change

A wide range of spatially explicit simulation models have been developed to forecast landscape dynamics, including models for projecting changes in both vegetation and land use. While these models have generally been developed as separate applications, each with a separate purpose and audience, they share many common features. We present a general framework, called a state‐and‐transition simulation model (STSM), which captures a number of these common features, accompanied by a software product, called ST‐Sim, to build and run such models. The STSM method divides a landscape into a set of discrete spatial units and simulates the discrete state of each cell forward as a discrete‐time‐inhomogeneous stochastic process. The method differs from a spatially interacting Markov chain in several important ways, including the ability to add discrete counters such as age and time‐since‐transition as state variables, to specify one‐step transition rates as either probabilities or target areas, and to represent multiple types of transitions between pairs of states. We demonstrate the STSM method using a model of land‐use/land‐cover (LULC) change for the state of Hawai'i, USA. Processes represented in this example include expansion/contraction of agricultural lands, urbanization, wildfire, shrub encroachment into grassland and harvest of tree plantations; the model also projects shifts in moisture zones due to climate change. Key model output includes projections of the future spatial and temporal distribution of LULC classes and moisture zones across the landscape over the next 50 years. State‐and‐transition simulation models can be applied to a wide range of landscapes, including questions of both land‐use change and vegetation dynamics. Because the method is inherently stochastic, it is well suited for characterizing uncertainty in model projections. 
When combined with the ST‐Sim software, STSMs offer a simple yet powerful means for developing a wide range of models of landscape dynamics.


Introduction
The world is composed of landscapes, natural and human-influenced, that are heterogeneous in space and time. Simulation models can provide valuable insights into the dynamics of these landscapes, including improving our understanding of how these landscapes change and, in turn, providing forecasts of their future state (Baker 1989; Sklar & Costanza 1991; Veldkamp & Lambin 2001).
Since the early 1970s, a wide range of spatially explicit simulation models have been developed to understand and forecast landscape dynamics. First, there are landscape vegetation models, developed principally by ecologists, which focus on predicting landscape-scale changes in vegetation in response to ecological drivers such as climate, biophysical conditions and disturbances (Baker 1989; Keane et al. 2004). While many of these landscape vegetation models have been developed for specific questions or regions (Keane et al. 2004), a few have been generalized sufficiently to become modelling platforms, including SELES (Fall & Fall 2001), TELSA (Kurz et al. 2000) and, for forested systems, LANDIS (Wang et al. 2014). Secondly, there are land-use/land-cover (LULC) change models, developed principally by geographers, where the focus is to represent the effects of human-driven processes on LULC change (Agarwal et al. 2002; Verburg et al. 2006; Brown et al. 2013). Examples of more general LULC change modelling platforms include CLUE-S/Dyna-CLUE (Verburg et al. 2002; Verburg & Overmars 2009), SLEUTH (Chaudhuri & Clarke 2013), DINAMICA EGO (Soares-Filho, Cerqueira & Pennachin 2002) and CA_MARKOV (Pontius & Malanson 2005).
Existing models of landscape dynamics share common features, many of which we believe can be captured in a generalized landscape modelling framework. Such a framework would reduce the duplication of efforts across modellers and foster innovation, communication and collaboration across the landscape modelling community. The approach we present here has emerged from our involvement in the development of landscape vegetation models for a range of ecological systems, including forests, rangelands, wetlands and land-use change (Wilson et al. 2014). Our method, which we refer to as a state-and-transition simulation model (STSM), can be characterized as follows: (i) space is represented as a set of discrete spatial units; (ii) time is represented in discrete steps; (iii) the change over time in the discrete state of each spatial unit is represented as a stochastic process; and (iv) time-inhomogeneous rates of change between states are expressed as probabilities.
Why the new term 'state-and-transition simulation model'? While STSMs share some common features with Markov chains, when the Markov chains are referenced spatially in the manner proposed by Baker (1989), the differences between Markov chains and our method are significant enough to warrant a different term. As will be described below, the principles behind an STSM were developed specifically to overcome some of the limitations of Markov chains for modelling landscape change. Secondly, while STSMs also share features with cellular automata models (Balzter, Braun & Kohler 1998; White & Engelen 2000), there are also significant differences between STSMs and cellular automata in their representation of spatial interactions. Furthermore, terms such as 'state-and-transition model' and 'transition model' have been widely used in the literature without a clear, formal definition as to their meaning, including references to conceptual diagrams (Stringham, Krueger & Shaver 2003), Markov chain models (Acevedo, Urban & Ablan 1995) and other forms of stochastic processes (Keane et al. 2004). Thus, the term 'STSM', which was first used by Czembor & Vesk (2009), serves to distinguish our specific approach from other more ambiguous terms.
The objective of this paper is to present the details of our STSM framework. We begin with a description of the STSM method, followed by a brief overview of the software available to develop STSMs. A case study example is then presented demonstrating some of the key elements of STSMs. We conclude with a brief discussion of how STSMs relate to other modelling approaches, and opportunities for further enhancements to the STSM framework.

STSM approach
Like most approaches to spatially explicit landscape modelling, the first step in developing an STSM is to divide the landscape spatially into a set $C$ of $n$ simulation cells; these cells can be any shape and size, although they are most commonly represented as a regular raster grid. An STSM represents the change in state of each simulation cell over time as a discrete-time stochastic process $\{X_t : t \ge 0\}$, where the state space is a set $S$ of $r$ discrete state types ($X_t \in S$) and $t$ represents discrete timesteps. As in a Markov chain, probabilities are defined for one-step transitions between states for each cell. In a Markov chain, these one-step transition probabilities are specified by defining the probability, $P_{ij}$, that the state of the stochastic process for each cell, $X_t$, will move from state type $i$ to state type $j$, where $i, j \in S$. These probabilities are then represented for each cell as an $r \times r$ transition matrix $P = (P_{ij})$.
In many landscape models, however, it is important to distinguish the different types of transitions between states, something that is not possible with Markov chains. For example, in a forested system, there can be multiple processes responsible for transitions between states, such as succession, wildfire and timber harvest, which are all combined into a single transition probability between any two states in a Markov chain representation. With an STSM, however, a set $U$ of $m$ discrete transition types are also defined for all cells, with a separate transition matrix $P$ defined for each possible transition type $k \in U$. The nonzero entries across all $P$ are referred to as transition pathways. So, in our forested system example, separate probabilities can be represented for each process: a separate transition matrix for each of succession, wildfire and timber harvest.
A single Monte Carlo realization of an STSM begins by setting initial values for the state type of each cell (i.e. assigning values to $X_0$) and then using the transition probabilities, $P_{ij}$, to simulate the state of each cell, $X_t$, for every successive timestep. To accommodate the requirement for multiple transition types in an STSM, within each timestep the transition matrices, $P$, are applied sequentially for each transition type $k \in U$ in order to update the state of each cell between timesteps. As more than one type of transition can occur within a timestep, the order in which the transition matrices are applied within each timestep can also be specified in an STSM, as this order will influence the results of a simulation.
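The sequential application of transition types can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the ST-Sim implementation: the function names `step_cell` and `run_realization`, the nested-dictionary layout of the pathways and the two-state example are all hypothetical, and the sketch assumes that a later transition type in the ordering sees the state produced by an earlier one.

```python
import random


def step_cell(state, pathways, order, rng):
    """Advance one cell by one timestep.

    pathways[k][i] is a dict {j: P_ij}: row i of the transition matrix
    for transition type k (hypothetical layout; zero entries are omitted).
    Transition types are applied one after another in the given order.
    """
    for k in order:
        row = pathways.get(k, {}).get(state, {})
        u = rng.random()
        cumulative = 0.0
        for j, p in row.items():
            cumulative += p
            if u < cumulative:
                state = j  # transition of type k occurred
                break
    return state


def run_realization(initial_states, pathways, order, n_steps, seed=None):
    """Simulate one Monte Carlo realization; returns the state of every
    cell at every timestep, starting with the initial conditions."""
    rng = random.Random(seed)
    states = list(initial_states)
    history = [list(states)]
    for _ in range(n_steps):
        states = [step_cell(s, pathways, order, rng) for s in states]
        history.append(list(states))
    return history


# Illustrative two-state model with certain (probability 1) transitions.
pathways = {
    "fire": {"forest": {"grass": 1.0}},
    "succession": {"grass": {"forest": 1.0}},
}
fire_first = run_realization(["forest"], pathways, ["fire", "succession"], 1)
succession_first = run_realization(["forest"], pathways, ["succession", "fire"], 1)
```

In this toy example the two orderings leave the cell in different states after one timestep, which illustrates why the order of application influences simulation results.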
A second difference between STSMs and Markov chains is that with STSMs 'counters' can be defined as additional state variables for each cell. Each counter is a positive integer random variable, measured in units of timesteps, that is initialized for every cell in a simulation and then incremented by one for every timestep. The motivation for including counters as additional state variables in STSMs is twofold: first, to report these as model outputs and secondly, to allow transition probabilities to be defined as a function of the value of these counters.

Age and time-since-transition
Counters are most commonly used to track the age and time-since-transition (TST) for each cell, where TST refers to the number of timesteps that have elapsed since one or more transition types last occurred. To capture this in a Markov chain, the state space for $X_t$ could be expanded to include all possible combinations of state type, age and TST. Formulating a stochastic process this way, however, results in an unmanageably large state space for even the simplest models. STSMs instead track each counter (in addition to the state type $X_t$) as a separate discrete-time stochastic process (Fig. 1).
In order to use counters to track age, each cell is assigned an initial age ($A_0$) at the start of the simulation, and this age is then updated every timestep using the following rules: (i) if a transition occurs for the cell, then a corresponding probability distribution is used to determine the fate of the cell's age; these probability distributions for the change in age can vary as a function of the state of the cell (i.e. its state type, age and TST) and the type of transition; (ii) if no transition occurs for the cell, then the age of the cell is incremented by 1. The TST is tracked in a similar way and repeated for each transition type.
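The counter rules above can be sketched as follows. This is an illustrative sketch only: the function name `update_counters` and the `age_rule` mapping are hypothetical, each age rule stands in for the probability distribution governing the fate of a cell's age, and the fallback of incrementing age when a transition type has no rule is an assumption made here for simplicity.

```python
import random


def update_counters(age, tst, occurred, age_rule=None, rng=random):
    """Return (new_age, new_tst) for one cell after one timestep.

    age      -- current age of the cell, in timesteps
    tst      -- dict {transition type: timesteps since it last occurred}
    occurred -- transition type that occurred this timestep, or None
    age_rule -- hypothetical dict {transition type: f(age, rng) -> new age}
                standing in for the age-fate probability distributions
    """
    if occurred is None:
        new_age = age + 1  # rule (ii): no transition, so the cell ages
    else:
        rule = (age_rule or {}).get(occurred)
        # Rule (i): sample the new age; with no rule we assume ageing continues.
        new_age = rule(age, rng) if rule else age + 1
    # TST resets for the type that occurred and increments for all others.
    new_tst = {k: (0 if k == occurred else v + 1) for k, v in tst.items()}
    return new_age, new_tst


# A harvest that always resets age to zero (a degenerate distribution).
rules = {"harvest": lambda age, rng: 0}
age, tst = update_counters(12, {"fire": 3, "harvest": 7}, "harvest", rules)
```

Tracking the counters separately in this way avoids enumerating every combination of state type, age and TST as a distinct state.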

Transition targets
Another limitation of Markov chains is that transitions between states must always be characterized in terms of a probability of occurrence. However, there are some transitions, often those that are management-oriented, that are more appropriately expressed as a target for the area to be transitioned over time, rather than as a probability. With STSMs, transitions can be characterized using either probabilities or target areas, where targets are dynamically converted into transition probabilities during the simulation; this conversion can be calculated if one knows the full state of the landscape at the time of the transition. STSMs also allow transition targets to be specified for other derived variables (e.g. the volume of timber harvest), so long as these variables can be expressed as a function of the STSM state variables (i.e. state type, age and TST) and transition types.
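One simple way to convert a target area into a transition probability is to divide the target by the area currently eligible for the transition. The sketch below assumes this ratio rule, equal-area cells and a hypothetical function name `target_to_probability`; it is not necessarily the conversion used by any particular software, and it matches the target only in expectation.

```python
def target_to_probability(target_area, cell_states, eligible_states, cell_area=1.0):
    """Convert an area target into a per-cell transition probability.

    Assumes every eligible cell has the same probability of transitioning,
    so that the expected transitioned area equals the target. The result is
    capped at 1, which means the target cannot be met when the eligible
    area is smaller than the target.
    """
    eligible_area = sum(cell_area for s in cell_states if s in eligible_states)
    if eligible_area == 0:
        return 0.0
    return min(1.0, target_area / eligible_area)


# Four eligible agriculture cells of 1 km^2 each and a 2 km^2 target.
landscape = ["agriculture", "agriculture", "forest", "agriculture", "agriculture"]
p = target_to_probability(2.0, landscape, {"agriculture"})
```

The cap at 1 shows why a target can go unmet once the source state becomes scarce on the landscape.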

Spatial and temporal heterogeneity
Because the state variables in STSMs are random variables, there is some inherent variability in when and where transitions occur in any one realization of a model. However, there are often situations in which additional temporal variability in transition probabilities is required in order to adequately capture the dynamics of some processes. For example, the annual transition probabilities for wildfire in the model of Fig. 1 might be better represented as a random variable, rather than a single value, reflecting the pattern of interannual variability in the expected amount of wildfire on the landscape due to changes in climatic conditions. In addition to temporal variability, the transition probabilities in STSMs can also vary spatially: it is possible to vary the transition probabilities for every cell and timestep in the landscape. Continuing the example of Fig. 1, one might expect wildfires to occur in patches (i.e. with a defined spatial autocorrelation). To accommodate this requirement, the transition probabilities in an STSM can also be expressed as random variables for each cell and timestep; as a result, it is ultimately possible to generate any spatial and temporal pattern of transitions on the landscape. Importantly, the transition probabilities for any one cell can be a function of the past and current state of the entire landscape, as represented by the values of the state variables associated with all of the landscape's cells (Fig. 2).
To simplify the representation of spatial and temporal variation in transition probabilities, a two-step process is often used with STSMs. First, a base probability for each transition type is defined as a random variable. Transition multipliers are then specified in order to scale the base probabilities over space and time. These multipliers are defined as a stochastic process for each transition type and cell, the distribution of which can also be space and time inhomogeneous. During a simulation, the realized values for the transition probabilities are the product of the realized base probabilities and transition multipliers, with concurrent probabilities renormalized such that the sum of all probabilities does not exceed 1. For example, to reproduce historical wildfire patterns in an STSM, the base wildfire probability is often estimated as a function of the long-term mean fire cycle, while the relative frequency distribution of area burned each year can be used to estimate the multipliers (e.g.
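The two-step calculation of realized probabilities can be sketched as below. This is a minimal sketch under assumptions: the function name `realized_probabilities` is hypothetical, the inputs are the already-realized base probabilities and multipliers for the transitions competing within a single cell and timestep, and proportional rescaling is assumed as the renormalization rule.

```python
def realized_probabilities(base, multipliers):
    """Combine realized base probabilities and transition multipliers.

    base        -- dict {transition type: realized base probability}
    multipliers -- dict {transition type: realized multiplier}; types
                   without an entry are assumed to have a multiplier of 1
    Returns the product for each type, rescaled proportionally (an
    assumption) whenever the concurrent probabilities sum to more than 1.
    """
    raw = {k: base[k] * multipliers.get(k, 1.0) for k in base}
    total = sum(raw.values())
    if total > 1.0:
        raw = {k: p / total for k, p in raw.items()}
    return raw


# A year with three times the mean fire activity for this cell.
probs = realized_probabilities({"fire": 0.02, "harvest": 0.1}, {"fire": 3.0})
```

Separating a base probability from its multipliers keeps the long-term mean rate and its variation over space and time as distinct, separately estimated inputs.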

Software
A key component of any modelling framework is its supporting software. While the concepts behind STSMs are simple enough that they could be coded from scratch in a spreadsheet, a robust software environment leads to more efficient model development, particularly for larger, more complicated models. To this end, a software product called ST-Sim, first released in 2013, has been created to support the development of STSMs (ApexRMS 2016). While ST-Sim is based on concepts originally developed in the TELSA software platform (Kurz et al. 2000), there are some important differences between TELSA and ST-Sim. First, ST-Sim is consistent with the general STSM framework outlined in this paper, while TELSA was developed as a specific model to support forest management in British Columbia, Canada. Secondly, TELSA uses a polygon-based representation of space, while ST-Sim is raster-based. This shift to a raster-based approach, along with other data management and multiprocessing extensions, has allowed ST-Sim to handle simulations across larger (i.e. $>10^6$ cells) landscapes (e.g. Costanza et al. 2015b). Finally, ST-Sim users can integrate external models (e.g. developed in R or Python) to dynamically generate transition probabilities within each timestep of a simulation, a key element of the framework. Details on how to download the ST-Sim software are provided in Appendix S1 (Supporting information).

Case study example
To illustrate the STSM method, we present a simple model of the dynamics of LULC for the state of Hawai'i (USA). The purpose of this modelling effort is to explore the interactions between possible future changes in LULC, combined with projected shifts in plant communities due to climate change, on the future spatial and temporal pattern of LULC across the state of Hawai'i. It is important to note that the description of the model and presentation of results here is purposefully brief, considering only a single 'business-as-usual' future scenario, in order to remain focussed upon the relationship between various features of the model and the STSM method presented above, rather than the broader context for the model's development and the interpretation of results. Additional details regarding the model parameterization, including ST-Sim software files, can be found in Appendices S1 and S2.

State variables and scales
The spatial extent for this model is the terrestrial portion of the state of Hawai'i, covering 16 416 km². The landscape was divided spatially into simulation cells, each of which is 1 × 1 km in size. Simulations were run for 50 years, with an annual timestep, using initial conditions corresponding to the year 2011; all simulations were repeated for 100 Monte Carlo realizations.
As in all STSMs, each cell in the simulation is characterized according to a suite of state variables, all of which are random variables, that are calculated for each simulation timestep. The first state variable is the state type of each cell: a total of 21 possible state types were defined for this model, consisting of all unique combinations of seven LULC classes (Grassland, Shrubland, Forest, Plantation, Agriculture, Developed and Barren), crossed with three possible moisture zones (Dry, Mesic and Wet). The second state variable is the age of each

As with all STSMs, a set of all the possible transition pathways between state types is defined (Fig. 3). The processes represented by these transitions include expansion and contraction of agricultural lands, urbanization, wildfire, shrub encroachment into grassland, harvest of tree plantations and shifts in moisture zones due to climate change. The order in which the one-step transition probabilities for each of the transition types are applied is randomized for every timestep and Monte Carlo realization. Transitions due to agricultural expansion, agricultural contraction and urbanization are modelled using STSM transition targets. In order to represent the historical temporal variability in land-use change, the simulated annual transition areas are sampled for each year and Monte Carlo realization from a uniform distribution fitted to the corresponding historical land change data (Table 1). STSM transition multipliers are used to further characterize the spatial pattern of these transitions; based upon existing zoning maps (State of Hawaii 2015), static transition multipliers were generated to (i) restrict agricultural expansion to areas zoned for agriculture; (ii) prevent agricultural contraction from occurring in important agricultural areas; and (iii) prevent urbanization in areas zoned for conservation, and double the relative transition probability of urbanization in areas zoned as either urban or rural.
In order to 'spread' transitions over time, transition multipliers are also generated (using an external model), for each cell, timestep and realization, such that (i) for agricultural expansion and urbanization, the relative transition probability increases linearly (from 0 to 1) as a function of the proportion of adjacent cells that are agriculture or developed, respectively; (ii) for agricultural contraction, the relative transition probability increases linearly (from 0 to 1) as a function of the proportion of adjacent cells that are forest/shrubland/grassland, and the relative probability of a cell transitioning to forest vs. shrubland vs. grassland increases linearly (from 0 to 1) as a function of the number of adjacent cells in each of these classes.
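A neighbourhood-based multiplier of this kind can be sketched as the proportion of adjacent cells in a given class. The sketch below is illustrative only: the function name `neighbour_multiplier` and the list-of-lists grid are hypothetical, an eight-cell neighbourhood is assumed, and cells beyond the landscape edge are simply excluded from the proportion.

```python
def neighbour_multiplier(grid, row, col, target_classes):
    """Proportion of adjacent cells whose class is in target_classes.

    grid is a list of rows of class labels. The eight-cell neighbourhood
    is an assumption; neighbours falling outside the grid are ignored, so
    edge cells are scored over fewer neighbours. The result lies between
    0 and 1 and increases linearly with the number of matching neighbours.
    """
    n_rows, n_cols = len(grid), len(grid[0])
    matching = total = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue  # skip the focal cell itself
            r, c = row + dr, col + dc
            if 0 <= r < n_rows and 0 <= c < n_cols:
                total += 1
                if grid[r][c] in target_classes:
                    matching += 1
    return matching / total if total else 0.0


# Urbanization multiplier for the centre cell of a 3 x 3 landscape.
grid = [
    ["Developed", "Developed", "Grassland"],
    ["Grassland", "Grassland", "Grassland"],
    ["Grassland", "Grassland", "Grassland"],
]
multiplier = neighbour_multiplier(grid, 1, 1, {"Developed"})
```

Because the multiplier depends on the current state of neighbouring cells, it has to be recomputed for every timestep and realization rather than supplied as a static map.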
A separate wildfire submodel is integrated into the STSM to determine which cells incur wildfire transitions for each simulated year and realization (details in Appendix S2). This submodel aims to reproduce the spatial and temporal pattern of historical wildfires. The results of this submodel are then used to dynamically assign transition probabilities of 1 for those cells that transition each year and 0 for all other cells. The wildfire submodel generates two sets of these transition probabilities for each timestep and realization, one for each of two classes (high and low) of fire severity.
Shrub encroachment into grassland from neighbouring shrubland is hypothesized to occur in the absence of regular fires; however, there is ecological uncertainty regarding if or how long it might take for this shrub encroachment to occur. To capture this behaviour in the model, shrub encroachment is represented as follows: (i) transitions are restricted to cells in states associated with the Grassland LULC class where the time-since-fire is at least 10 years and the moisture zone is Dry or Mesic; (ii) at least one of the cell's eight neighbours must be in a state associated with the Shrubland LULC class; and (iii) the annual transition probability is sampled from a uniform distribution for each realization, where the bounds of this distribution were set to 0.006 and 0.0327, corresponding to a cumulative transition probability of 0.95 from 10 to 500 and 50 years without fire, respectively. Tree plantations in Hawai'i are generally harvested starting at the age of 5 (Whitesell et al. 1992), although the stand age at harvest is quite variable, due principally to economic uncertainties (J. Jacobi, personal communication). To capture this dynamic, plantation harvest is modelled as follows: (i) transitions are restricted to cells in states associated with the Plantation LULC class where age is at least 5 years; (ii) harvest transitions reset the age to 0; and (iii) the annual transition probability is sampled from a uniform distribution for each year and realization, where the lower and upper bounds of this distribution were set to 0.031 and 0.259, corresponding to a cumulative transition probability of 0.95 from age 5 to 100 and 15, respectively.
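The link between an annual probability and a cumulative probability over a window of years can be illustrated with the standard relationship for a constant annual probability applied independently each year, $p = 1 - (1 - P_{cum})^{1/n}$. The sketch below assumes this relationship and uses a hypothetical function name, `annual_probability`; applied to the plantation harvest windows stated above (age 5 to 100 and age 5 to 15), it gives values consistent with the quoted bounds.

```python
def annual_probability(cumulative, n_years):
    """Constant annual probability giving the stated cumulative probability
    of at least one transition over n_years, assuming independent years."""
    return 1.0 - (1.0 - cumulative) ** (1.0 / n_years)


# Plantation harvest: 0.95 cumulative probability between age 5 and age 100
# (95 years) and between age 5 and age 15 (10 years).
lower = annual_probability(0.95, 100 - 5)
upper = annual_probability(0.95, 15 - 5)
```

Rounded to three decimal places, these evaluate to 0.031 and 0.259.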
Finally, output from an existing analysis of the effect of climate change on shifts in moisture zones (Fortini, Jacobi & Price in press) was integrated into the STSM. The 100-year projections for the area that will transition between moisture zones were converted to annual transition targets for each of the four moisture zone transition types (Table 1). While no temporal variability was modelled for these transitions, transition multipliers were used to restrict the location of these transitions to cells within the zones predicted by Fortini, Jacobi & Price (in press) for each transition type.

Initialization
The initial state type of each cell (i.e. in year 2011) was estimated by combining existing 30-m resolution maps of the LULC class (U.S. Geological Survey 2011) and moisture zone (Fortini, Jacobi & Price in press) for the state of Hawai'i, which were then aggregated using a majority algorithm to a 1-km resolution (Fig. 4). The same initial state was used for all realizations of the model.
The initial age for cells associated with the Plantation LULC class is modelled as a random variable with a uniform distribution between 0 and 100 (i.e. the maximum harvest age). The initial time-since-fire for cells in states associated with the Grassland LULC class is also modelled using a uniform distribution with a lower bound of 0. The upper bound of this distribution was set to the fire cycle (Van Wagner 1978); fire cycles were estimated from historical fire data (Eidenshink et al. 2007) as 69 and 502 years for the Dry and Mesic moisture zones, respectively. The initial age and TST of each cell are resampled for each realization of the model.  (Buckland 1984), or as maps averaged over timesteps and realizations.
Our case study provides a sample of the type of output that can be generated with STSMs; note, however, that because we present only a single future scenario, the results shown here are a scenario projection for LULC in Hawai'i, and not a prediction of the likely future state. In our case study scenario, we see that levels of agricultural expansion/contraction and urbanization are projected to match the historical levels from Table 1 at least for the first 30 years (Fig. 5); this is to be expected, as we set the levels using transition targets. Agricultural contraction decreases beyond this point, however, due to an eventual shortfall in agricultural land. In contrast, the area projected to transition due to wildfire, plantation harvest and shrub encroachment emerges as a function of the dynamic state of the landscape, as these transitions were parameterized using probabilities. Variability is projected to be greatest for wildfire transitions, as our wildfire submodel was configured to reproduce the historical annual pattern of wildfire variability. The variability for other transitions, however, is likely underestimated given the limited historical data used to characterize their future variability. From Fig. 6, we see the projected spatial pattern for transitions: urbanization, for example, spreads out from existing developed areas, while wildfire occurs with greater probability in the dry grassland communities. Figures 7-9 summarize the corresponding state of the landscape that emerges as a result of these projected transitions. For our case study scenario, we see a projected loss of grassland and agricultural areas, and a corresponding increase in shrubland and developed areas, due to the combined effects of all transitions represented in the model (Figs 7 and 8). Uncertainty is greatest for the projections of future grassland and shrubland area, due to the high uncertainty associated with the future rate of shrub encroachment. 
Finally, we see a shift from Mesic to both Dry and Wet moisture zones due to the projected effects of climate change (Fig. 9).

Discussion
A key feature of STSMs is their combination of simplicity and generality: while STSMs are rooted in the intuitive principles of Markov chains, they have important adaptations that make them applicable to a wide range of landscape management questions. A second key feature is their explicit representation of uncertainty, an important consideration for most modelling applications. The Hawai'i case study presented here demonstrates these key features. The projections for states and transitions are expressed as distributions, rather than simply mean values, thus incorporating, through Monte Carlo simulations, the combined uncertainties of multiple model inputs. While several of the transition rates projected by the case study model purposefully match historical distributions, the added value of reproducing these rates in the STSM is that one is then able to reflect the combined consequence of multiple model inputs, including their uncertainties, on the future projections for model outputs. Note that, for this simple example, we did not attempt to account for all model input uncertainties during simulations, and as such, our results should be considered as only a single sensitivity analysis of the modelled system. The moisture zone analysis, in particular, has large uncertainties that were not modelled, highlighting the challenges of developing STSMs using the results of analyses that themselves do not incorporate uncertainty.
STSMs differ from other spatial modelling methods in a number of ways. In contrast to cellular automata (Balzter, Braun & Kohler 1998; White & Engelen 2000) and spatially interacting Markov chains (Baker 1989; Monticino, Cogdill & Acevedo 2002), STSMs track multiple state variables for each simulation cell. STSMs also differ from Markov chains in that they allow for multiple transition pathways between pairs of states and for transitions to be specified as target areas. Other methods differ from STSMs in that they track continuous rather than discrete state variables. One such example is coupled map lattices (Kaneko 1992), the continuous state variable equivalent of cellular automata (Fonstad 2006). Another example is the LANDIS model, which for most applications tracks either biomass (Scheller & Lucash 2014) or the number of trees (Wang et al. 2014) by species and age class as its state variables. The other major difference between STSMs and LANDIS is that LANDIS has been designed for use specifically with forested systems, relying on tree species life-history traits to drive its dynamics, whereas STSMs are not specific to any particular vegetation community. LANDIS has thus been used principally for applications where details regarding individual tree species by age cohort may be important (e.g. management of uneven-aged forests), and for which there is sufficient life-history data to parameterize the model. STSMs, on the other hand, typically track only the forest community (i.e. species assemblages) and age as state variables, similar to the approach used in most timber supply models (e.g.

There are also similarities between STSMs and LULC change models. Like STSMs, LULC change models use a discrete representation of space, time and state. In general, LULC change models divide the simulation process into two steps every timestep (Mas et al. 2014; National Research Council 2014). First, they calculate the total amount of change between states over one or more regions; for example, several models do this by applying a Markov chain to historical data in order to generate a matrix of transition probabilities between states (Soares-Filho, Cerqueira & Pennachin 2002; Pontius & Malanson 2005). Next, the models allocate this total change spatially across cells based on each cell's suitability for change; the suitability of each cell is often calculated as a relative probability, which can be influenced by both external factors and neighbourhood interactions. STSMs are well suited to capture these same dynamics. The amount of change between states can be represented through the specification of either target areas (i.e. top-down demand) or probabilities (i.e. bottom-up conversion) for transitions. As shown in the case study example, the distribution of these transitions can be made to follow any spatial or temporal pattern using the STSM transition multiplier feature. These multipliers can be calculated as a function of both external drivers and the state of neighbouring cells. A major difference between STSMs and most LULC change models, however, is the generality of the STSM method; STSMs focus on those elements of landscape dynamics that are common to most applications: simulating changes in the state of spatially referenced discrete random variables over time as a function of probabilistic transitions. For example, the STSM method does not include specific routines for estimating transition probabilities over time, nor does it include routines for statistically fitting spatial suitability relationships or representing cell interactions. Rather, these tasks are accomplished through the development of external models, which are then used to generate STSM transition probabilities, either dynamically or a priori. There are several benefits of this generality: (i) a single, intuitive framework can be used for a wide range of applications, including modelling both vegetation dynamics and land-use change; (ii) STSMs are inherently stochastic, providing a framework for capturing uncertainty throughout the modelling process; and (iii) STSMs provide a common framework within which alternative approaches/models can be compared.
State-and-transition simulation models currently have some limitations. The first is that STSMs are only able to track discrete state variables. While there are many systems for which this limitation is a reasonable and often useful approximation, there are other systems and questions for which continuous state variables may also be required; extending the STSM framework to include continuous state variables is an area we are actively pursuing. A second limitation with STSMs is the absence of any capability to integrate agent/individual-based models (Grimm & Railsback 2005; Matthews et al. 2007); these models have become an increasingly important approach for representing certain drivers of landscape dynamics (DeAngelis & Mooij 2005; National Research Council 2014). We believe that future efforts should explore possible ways to integrate these two approaches. Finally, while STSMs provide the opportunity to characterize model uncertainty, through the expression of states and transitions as random variables, characterizing this uncertainty, in particular the covariance between transition probabilities, remains an important challenge.