Tree phenology responses to winter chilling, spring warming, at north and south range limits
Summary
- Increases in primary production may occur if plants respond to climate warming with prolonged growing seasons, but not if local adaptation, cued by photoperiod, limits phenological advance. It has been hypothesized that trees with diffuse-porous xylem anatomy and early successional species may respond most to warming. Within species, northern populations may respond most because their growing seasons are relatively short. Species most sensitive to spring temperature may show little overall response to warming if reduced chilling in fall/winter offsets accelerated winter/spring development.
- Because current thermal models consider only highly aggregated variables, for example degree-days or chilling units (temperature sums for a season or year), they may not accurately represent warming effects. We show that assumptions contained in current thermal (degree-day) models are unrealistic for climate change analysis. Critical threshold parameters are not identifiable, and they do not actually have much to do with thresholds for development. Traditional models further overlook the discrete nature of observations, observation error and the continuous response of phenological development to temperature variation. An alternative continuous development model (CDM) that addresses these problems is applied to a large experimental warming study near northern and southern boundaries of 15 species in the eastern deciduous forest of the USA, in North Carolina and Massachusetts.
- Results provide a detailed time course of phenological development, including vernalization during winter and warming in spring, and challenge the basic assumptions of thermal models. Where traditional models find little evidence of a chilling effect (most are insignificant or have the wrong sign), the continuous development model finds evidence of chilling effects in most species.
- Contrary to the hypothesis that northern populations respond most, we find southern populations are most responsive. Because northern populations already have a compressed period for spring development, they may lack flexibility to further advance development. A stronger response in the southern range could allow residents to resist northward migration of immigrants as climate warms. If potential invaders fail to exploit a prolonged growing season to the same degree as residents, then there is a resident advantage.
- Hypothesized effects of warming for xylem anatomy and successional status are not supported by the 15 species in this study.
Introduction
Climate warming will increase productivity of forests if trees can exploit longer growing seasons (Goulden et al. 1996; Keeling, Chin & Whorf 1996; Chuine & Beaubien 2001; Nemani et al. 2003; Baldocchi et al. 2005; Churkina et al. 2005; Richardson et al. 2009). On the one hand, geographic variation (Badeck et al. 2004; Yang et al. 2012) and advances in budbreak with warming trends over time (Schwartz 1998; Menzel & Fabian 1999; Fitter & Fitter 2002; Davi et al. 2006; Menzel et al. 2006; Van Vliet & Wielgolaski 2006; Cleland et al. 2007; Jeong et al. 2011; Wang et al. 2011; Yang et al. 2012) both suggest that growth will respond to warming unless offset by limited moisture. On the other hand, photoperiod cues can dampen phenological advance (Wareing 1956; Ashby et al. 1992; Mimura & Aitken 2007; Aldrete, Mexal & Burr 2008; Lopez et al. 2008; Körner & Basler 2010; Cooke, Eriksson & Junttila 2012). The interactions and conflicting evidence from different types of data (Parmesan 2007; Diez et al. 2012; Wolkovich et al. 2012; Yang et al. 2012) have frustrated efforts to predict responses to climate change. We implement a hierarchical model of phenological response to warming that accounts for the sources of uncertainty. Application of a new approach to a large experimental warming study shows the extent of the response for each species, northern vs southern seed sources within species, depending on temperature variation over the previous seasons. We evaluate species responses in the context of different ecological traits, such as xylem anatomy and successional status, and population differences resulting from seed sources from northern and southern portions of the range.
Traditional thermal models (including degree-day models) can misrepresent how warming affects phenology, in part because they collapse temperature time series into a cumulative sum or mean value for a year or season. The temperature threshold in these models defines onset of phenological development. Such a mechanism could protect vulnerable tissues from exposure to late frost. The cumulative sum of temperatures above the threshold, degree-days (DD), is taken to be a requirement that must be fulfilled before a discrete phenological event (e.g. budbreak) happens. Threshold warming and chilling parameters in thermal models (e.g. Chuine 2010; Polger & Primack 2011; Cook, Wolkovich & Parmesan 2012; Cooke, Eriksson & Junttila 2012) are hard to quantify on the basis of a single event like time of budbreak (Hunter & Lechowicz 1992; Hänninen 1995; Chuine, Cour & Rousseau 1998; Bailey & Harrington 2006; Allen et al. 2013). Interactions involving chilling followed by subsequent warming are even harder to model (Hänninen 1995; Hänninen, Slaney & Linder 2007). For example, Cook, Wolkovich & Parmesan (2012) hypothesized that warming has offsetting effects, delaying a chilling requirement to break dormancy while accelerating spring development. They hypothesize that if this interaction is general, the largest physiological effects of warming might be experienced by species that show the least phenological effect. However, the analysis did not find evidence of chilling requirements for 80% of species tested. A recent study that treats budbreak as a time-to-event process (as opposed to linear regression), also using degree-days as the predictor (Terres et al. 2013), could not converge on both threshold parameters, requiring instead a model-selection approach, that is with fixed parameter values. 
We begin by showing why aggregating temperature variation into a single number (date of budbreak or flowering) precludes parameter estimation and, more generally, why such thresholds do not actually represent the onset of development.
A second limitation of current models concerns treatment of data. Phenological states are observed at intervals, but models interpret them as transitions, often as continuous variables (but see Terres et al. 2013; Allen et al. 2013 for binary treatment). Phenological observations at intervals of days to several weeks do not document when an individual reached a given state, only the interval between dates when it occurred. Thus, not only are observations ordinal, they are also interval-censored. Finally, we are unaware of models that allow for errors in the developmental process and in the observations.
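The interval-censored nature of such records can be illustrated with a minimal Python sketch. The census dates, the observed states and the helper name `censoring_interval` are hypothetical and serve only to show that a census record brackets the date a stage was reached rather than identifying it.

```python
def censoring_interval(observations, stage):
    """Return (lower, upper) census days that bracket the first time
    `stage` was reached.

    observations: list of (day_of_year, observed_state), sorted by day.
    lower is the last census before the stage was seen (None if the stage
    was already present at the first census); upper is the first census at
    which the stage was seen (None if it was never seen).
    """
    lower = None
    for day, state in observations:
        if state >= stage:
            return lower, day
        lower = day
    return lower, None


# Hypothetical weekly census of one seedling (day of year, state 1-6)
obs = [(90, 1), (97, 2), (104, 2), (111, 4), (118, 5)]

# Budbreak (stage 3) was never observed directly; it is only known to
# have happened after day 104 and no later than day 111.
print(censoring_interval(obs, 3))
```

In this made-up record stage 3 is skipped entirely between censuses, so any analysis that treats day 111 as "the" budbreak date overstates the precision of the data.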
Understanding how climate change will impact phenology could benefit from two innovations, (i) accommodation of the discrete, ordinal and censored nature of observed events and (ii) an underlying process that admits the continuous nature of phenology development. Modelling (unobserved) underlying development from discrete observations allows us to exploit the advantages of state-space representations within a hierarchical framework (Calder et al. 2003; Clark et al. 2011a). The model developed here, a continuous development model (CDM), allows us to directly quantify and test the effects of environment on spring phenology.
Firstly, we consider the role of timing, but unlike a previous analysis of spring development (Clark et al. 2014b), we focus on the relationship between spring warming and chilling requirements. The importance of timing is expected to vary among species. Early successional species might risk early budbreak due to already unreliable conditions for colonizing species (Körner & Basler 2010), although effects of successional status are not obvious in the large study of Lopez et al. (2008). The capacity to advance budbreak could be limited in species having ring-porous xylem because leaf expansion follows vessel formation (Wang, Ives & Lechowicz 1992; Sass-Klaassen, Sabajo & den Ouden 2011). However, ring-porous Quercus and Fraxinus sometimes appear more responsive to warming than diffuse-porous species (Vitasse et al. 2009).
Secondly, we examine the role of latitude, from northern and southern ranges for eastern temperate tree species. Timing of warming and species attributes could interact in different ways at different latitudes (Cooke, Eriksson & Junttila 2012). Migration in response to climate change could be enhanced if northward-migrating species benefit from warming more than resident competitors (Ibáñez, Clark & Dietze 2009). Individuals in the northern part of their range might also suffer negative carbon balance in years with short growing seasons (Morin et al. 2009) and thus might risk early budburst when spring temperatures permit. Such close climate tracking puts the individual at increased risk of late frost, but that risk could be necessary when living near a physiological limit. Finally, we evaluate the hypothesis that the largest effects of warming could be missed because they entail offsetting responses in fall and spring (Cook, Wolkovich & Parmesan 2012).
The application we describe is experimental. Observational studies have the advantage of comparatively low cost and, in some cases, long duration (Menzel et al. 2006; Polger & Primack 2011; Allen et al. 2013). Remote sensing can provide spatial perspective, but does not control for species composition; when year-to-year timing of budbreak varies with latitude and elevation, it may be unclear whether climate gradients or varying species abundance is the cause (Badeck et al. 2004; Hwang et al. 2011). More generally, observational data do not control for source population or for correlated climate variables (Diez et al. 2012; Clark et al. 2014b). They do not address the effects of warming outside the variation that occurred during the observation period. Experiments provide opportunity to control for population source and to isolate the effects of warming from other sources of variation. A recent report that experiments underestimate the effect of warming on phenology (Wolkovich et al. 2012) came from an analysis that applied different metrics to the two types of studies. We are unaware of evidence that experiments systematically underestimate warming effects (Clark et al. 2014a,b,c).
We use a continuous development model with experimental data to show why the thermal inefficiency assumption of degree-day models can lead to inconsistent results. A previous paper introduced this new approach, focused on why experiments and observational studies could be misinterpreted and showing that there is a period in late winter when many species are most sensitive to warming (Clark et al. 2014a,b,c). Here, we provide technical background on the model and use it to evaluate the relationship between warming effects and chilling requirements and how they may differ among species and geographically. To motivate this approach, we begin by showing why existing degree-day models have limited application to climate change. We find a relationship between responses to spring warming and winter chilling that differs from previous studies. Finally, we evaluate differences in development rates of northern vs southern species and northern vs southern seed sources of the same species, in xylem anatomy, and successional stage.
Why thermal (degree-day) models for spring phenology are misinterpreted
The basic degree-day model


where number of days or cumulative daily temperatures below threshold T′C are counted, beginning at a date in autumn, after onset of dormancy and continuing until an unobserved date ci, when the chilling requirement CUi is assumed to be satisfied. Some models accumulate warming DDs only after the chilling requirement is satisfied (d0 ≥ ci); others accumulate warming DDs and chilling units simultaneously (d0 < ci) (Linkosalo, Häkkinen & Hänninen 2006). Because the date where such a transition might occur is unobserved, it is difficult to find a biological justification for either assumption (Cooke, Eriksson & Junttila 2012).
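For reference, a conventional form of the warming and chilling sums that is consistent with the symbols used in this section can be sketched as follows. This is an assumed reconstruction rather than the authors' exact notation: the indicator function I(·), the budbreak date d_i and the autumn start date a_0 are introduced here for illustration, and the first expression is only presumed to correspond to eqn 1.

```latex
\begin{align}
DD_i &= \sum_{t=d_0}^{d_i} \left(T_{i,t} - T'\right)\, I\!\left(T_{i,t} > T'\right), \\
CU_i &= \sum_{t=a_0}^{c_i} \left(T'_C - T_{i,t}\right)\, I\!\left(T_{i,t} < T'_C\right).
\end{align}
```

When chilling is counted as a number of days rather than as a temperature sum, the factor (T'_C − T_{i,t}) in the second expression is replaced by 1.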
Degree-day models are increasingly used not only to explain geographic variation in onset of the growing season (Ibáñez et al. 2010; Yang et al. 2012; Terres et al. 2013) but also to quantify responses to climate variation (e.g. Menzel et al. 2006; Wolkovich et al. 2012; Polger & Primack 2011). If the degree-day model is accurate, then plots of budbreak dates against DD in different temperature regimes should be identical (Fig. 1b, top), despite differences in timing of budbreak (Fig. 1a, top). Conversely, phenology will not advance with warming if adaptive mechanisms against late frost damage cued to photoperiod oppose early budbreak (Fig. 1a, bottom).
If phenology responds to temperature in different ways at different phenological stages (Campoy, Ruiz & Egea 2011), then the responses could be more complex than the dichotomy represented by the hypotheses in Fig. 1. In other words, a specific value of DDi does not discriminate between a warm winter followed by a cool spring, or vice versa. If phenological development varies in sensitivity over time, we should not expect these two temperature sequences to result in budbreak at the same DDi value. We test the hypothesis that warming accelerates phenology as predicted by the degree-day model, but first discuss why the degree-day model can miss important responses to climate change.
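The loss of information in a degree-day sum can be made concrete with a minimal Python sketch. The two 60-day temperature sequences and the 2 °C threshold are hypothetical values chosen for illustration, not data from this study.

```python
def degree_days(temps, threshold):
    """Cumulative degree-days: daily excess above the threshold, summed."""
    return sum(max(t - threshold, 0.0) for t in temps)


# Hypothetical daily mean temperatures (deg C) over 60 days
warm_winter_cool_spring = [8.0] * 30 + [4.0] * 30
cool_winter_warm_spring = [4.0] * 30 + [8.0] * 30

dd_a = degree_days(warm_winter_cool_spring, threshold=2.0)
dd_b = degree_days(cool_winter_warm_spring, threshold=2.0)

# Both sequences yield the same sum, so a model driven only by DD
# cannot tell them apart.
print(dd_a, dd_b)
```

Because the sum is invariant to the order of the daily values, any difference in budbreak between two such sequences would have to be attributed to something other than DD.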

Sources of confusion
When temperature varies (in contrast to agronomic germination studies where temperature is held constant, e.g. Welbaum, Tissaoui & Bradford 1990; Bradford & Still 2004), the threshold T′ is difficult to identify, because only the sequence of temperatures Ti is known in eqn. 1. Timing of onset d0 is not known. Including a cooling threshold temperature T′C and unknown termination of cooling ci worsens parameter identifiability problems (Terres et al. 2013; Clark et al. 2014a). Even so, estimates of chilling and warming thresholds are widely reported in the literature.
In fact, parameter identifiability can be viewed as only part of a larger problem, the fact that degree-day models do not reflect the biological assumptions that most biologists attribute to them. A motivation for the threshold temperature centres on conditions required to break dormancy, as low temperatures satisfy a chilling requirement and then rise above a threshold for development in winter/spring. Adaptive explanations focus on delayed onset (DO) of development, which could reduce risk of frost damage. In Appendix S1 (Supporting information), we show that degree-day models are in fact insensitive to events early in the year when temperatures are close to hypothesized thresholds. Instead, the dominant effect in the model is to change the effect of temperatures when they are above the threshold. A high threshold in the model reduces the effect of all temperatures above that threshold. On the other hand, a high threshold has little effect on the degree-days that are counted near the time when development is assumed to begin, on day d0 in eqn. 1. A number of hypotheses have been suggested to explain why a threshold temperature might be adaptive for delaying onset of development (e.g. Campoy, Ruiz & Egea 2011, Cooke et al. 2012). However, we are unaware of biological justification for the assumption that is actually built into the DD model, that a high threshold for onset of development reduces the effect of temperatures on development rate after the threshold is passed.
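The point that a higher threshold acts mainly on warm days, rather than on days near the threshold, can be checked with a short Python sketch. The linear warming sequence and the two thresholds (2 °C and 5 °C) are hypothetical; the sketch is not the analysis of Appendix S1.

```python
def daily_contributions(temps, threshold):
    """Degree-days contributed by each day under a given threshold."""
    return [max(t - threshold, 0.0) for t in temps]


# Hypothetical spring: temperature rises linearly from 0 to 20 deg C
temps = [0.5 * d for d in range(41)]

low = daily_contributions(temps, threshold=2.0)
high = daily_contributions(temps, threshold=5.0)

# Degree-days lost on each day when the threshold is raised from 2 to 5
loss = [lo - hi for lo, hi in zip(low, high)]

early_loss = sum(loss[:11])  # days with temperatures up to 5 deg C
late_loss = sum(loss[11:])   # days with temperatures above 5 deg C

print(early_loss, late_loss, late_loss / (early_loss + late_loss))
```

In this example roughly nine-tenths of the degree-days removed by the higher threshold come from days that are already well above it, each of which loses the same fixed amount. The days around the assumed onset of development contribute little under either threshold.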
Methods
In this section, we extend the continuous development model (CDM) of Clark et al. (2014a), which allows for responses to continuous weather variation, and then describe the experimental methods.
Model development
Consider continuous phenological development of an individual plant, initiated by fulfilment of a chilling requirement followed by warming temperatures in winter/spring, and that proceeds irregularly due to fluctuating daily temperatures, its current developmental state, and traits of the individual plant, such as genetic variation associated with seed source. While this development is continuous, it is observed only as recognizable discrete states at times when plants are observed. In fact, the state changes themselves are not discrete, nor is the precise timing of changes between them. Clark et al. (2014a) developed a hierarchical Bayes model to coherently integrate discrete observations of a continuous process.
Appendix S2 describes the three levels of our CDM, each having a different role. The first level consists of discrete, ordinal observations of phenological states, Siy,t, with values 1–6 corresponding to no bud activity (1) through full leaf expansion (Norby, Hartz-Rubin & Verbrugge 2003). The observations are obtained on individual i in year y and day t. All variables have the same subscripts; so, we hereafter omit them. The second level consists of ‘true’ ordinal states s, which progress monotonically from 1 through 6 as a result of underlying continuous development h(t), a continuous function of time t (in days). The separation of observation S from true state s allows for observation error, the fact that an observer cannot always assign precisely the same discrete state to traits that vary continuously. The observed stages need not be monotonic, but the true states are monotonic. The continuous developmental state h(t) constitutes level three, which allows for the fact that development responds continuously to fluctuating temperatures and other factors in the environment and endogenous to the plant. Diagnostics are contained in Appendix S3.
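The three levels can be illustrated with a small forward simulation in Python. Everything specific in it is an assumption made for illustration and is not taken from Appendix S2: the development rate, the lognormal process error, the cutpoints on the h scale, the 10% misclassification rate and the weekly census are all hypothetical.

```python
import bisect
import random


def simulate_individual(temps, cutpoints, rate=0.003, noise_sd=0.2,
                        misclass=0.1, obs_every=7, seed=1):
    """Simulate latent development h, true states s and observed states S."""
    rng = random.Random(seed)
    h = 1.0  # development starts at h = 1 (state 1)
    latent, true_states, observed = [], [], []
    for day, temp in enumerate(temps):
        # Level 3: continuous development responds to daily temperature,
        # with multiplicative process error; increments are never negative.
        h += rate * max(temp, 0.0) * rng.lognormvariate(0.0, noise_sd)
        # Level 2: true ordinal state defined by cutpoints on the h scale,
        # so s is monotonic whenever h is.
        s = 1 + bisect.bisect_right(cutpoints, h)
        latent.append(h)
        true_states.append(s)
        # Level 1: states are recorded only on census days, and the
        # observer is occasionally off by one stage.
        if day % obs_every == 0:
            obs_state = s
            if rng.random() < misclass:
                obs_state = min(6, max(1, s + rng.choice([-1, 1])))
            observed.append((day, obs_state))
    return latent, true_states, observed


temps = [0.25 * d for d in range(120)]      # hypothetical warming spring
cutpoints = [1.5, 2.5, 3.5, 4.5, 5.5]       # hypothetical stage boundaries
latent, true_states, observed = simulate_individual(temps, cutpoints)
print(observed)
```

Inference in the CDM runs in the opposite direction, from the observed record back to h and the parameters. The simulation only shows how non-monotonic observations can arise from a monotonic true state.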
To evaluate the effects of temperature on budbreak, we quantify its effects in combination with other variables in the model in Appendix S4. The quantity γk is a sensitivity coefficient describing the effect of a unit change in temperature on the rate of progress towards stage k = 1,…, 6 (e.g. budbreak is stage k = 3; Table 1). There is a value of γk for each individual and time. We use index γ to identify times of high sensitivity, that is when temperatures have a large effect on phenological development.
Stage | Hardwoods | Conifers |
---|---|---|
1 | No visible bud swelling | No bud expansion |
2 | Bud swelling | Stem elongating |
3 | Bud break has occurred | Needles have emerged from sheaths |
4 | Leaves unfolding | Needles partly elongated |
5 | Leaves open, not fully expanded | Needles mostly elongated but unhardened |
6 | Fully expanded | Needles hardened |
Experimental design and data collection
Experimental warming was implemented near the southern and northern range limits of North America's eastern deciduous forest (Fig. 2). The southern site, Duke Forest (DF), North Carolina (36·0° N, 79·1° W; elevation 180 m), has a mean annual temperature of 14·5 °C and annual precipitation of 1208 mm, nearly all of which falls as rain. The northern site, Harvard Forest (HF), Massachusetts (42·5° N, 72·2° W; elevation 340 m), has a mean annual temperature of 7·5 °C with 1183 mm of precipitation. At Harvard Forest, snow cover is common from December to March.

The experiment used a factorial design to provide replication for all combinations of three temperatures (ambient, ambient +3 °C and ambient +5 °C assigned randomly) in a total of 18 open-top chambers per site. The 17-m2 rectangular chambers were gridded into 14 columns and 30 rows at 15-cm spacing, resulting in 420 planting locations. Thus, each chamber supports two 4·6 m × 1·05 m heated areas. Chamber walls consist of transparent plastic greenhouse sheeting attached to wooden frames, with heights of 2·5 m. Six additional control plots had no chamber.
Soil and air were heated with independent systems to track ambient temperatures with the consistent +3 °C and +5 °C offsets. Soil was heated with electric resistance cable buried 10 cm deep at 20-cm spacing (Melillo et al. 2002). Temperature offsets were maintained by an automated tracking system. Ambient chambers received buried cables as a disturbance control, but they were not heated. Air was heated indirectly with propane. Heated water with non-toxic propylene glycol was pumped to a heat-exchange coil and airflow system in +3 and +5 chambers. Air circulating through the ambient chambers was not heated. The temperature offset was maintained by a defined heat delivery rate. Environmental variables monitored in each chamber included soil (10 cm deep) and air temperatures (30 cm above-ground). Control of soil temperature in the heated chambers was excellent (±), whereas control of the air temperature was less precise, in part due to air scooping on windy days. The combinations of chambers, sites, treatments and years provide substantial variation in temperatures.
Cohorts were established annually from seed, beginning in 2009, obtained from sites across eastern North America. For purposes of seed provenance analysis, they were classified as north or south of 40°N latitude (Table 2). Seeds from each source were planted in equal numbers at each site. Planting occurred at the times of seed dispersal for each species in the mineral soil horizon at grid locations. Each year existing cohorts were amended with new planting, replacing losses due to death and selective harvest of some individuals for analysis. Seedlings naturally recruited from the surrounding forest were also marked and monitored at both sites.
Selected model | Species | Xylem anatomy | Number of trees | Tree years | Southern populations | Northern populations |
---|---|---|---|---|---|---|
T | Magnolia grandiflora | D | 69 | 153 | 2 | 1 |
T, C | Betula papyrifera | D | 123 | 359 | 0 | 1 |
T, C | Liquidambar styraciflua | D | 124 | 383 | 4 | 0 |
T, C | Liriodendron tulipifera | D | 311 | 933 | 6 | 1 |
T, C | Nyssa sylvatica | D | 148 | 339 | 5 | 3 |
T, C | Pinus palustris | C | 40 | 105 | 3 | 0 |
T, C | Pinus taeda | C | 283 | 838 | 3 | 0 |
T, C, T × C | Betula alleghaniensis | D | 344 | 950 | 0 | 3 |
T, C, T × C | Pinus resinosa | C | 62 | 196 | 0 | 1 |
T, C, T × C | Pinus strobus | C | 241 | 685 | 2 | 3 |
T, C, S | Quercus alba | R | 980 | 2659 | 5 | 2 |
T, C, T × C, S, T × S | Fraxinus americana | R | 161 | 321 | 4 | 3 |
T, C, S, T × S | Acer rubrum | D | 1649 | 4260 | 4 | 3 |
T, C, S, T × S | Acer saccharum | D | 191 | 391 | 5 | 3 |
T, C, S, T × S | Quercus rubra | R | 246 | 622 | 4 | 4 |
- Xylem anatomy codes are D – diffuse porous, R – ring porous, C – conifer. Model codes are T – temperature, C – chilling units, S – seed source. Species are listed in order of model complexity selected by DIC.
Weekly censuses enabled us to quantify germination, demographic and phenological responses to warming in each experimental year (2009–2012). The opening of buds and development of leaves in the spring were scored on a scale of 1 (no bud activity) to 6 (fully hardened and expanded leaves, Table 1) using the Norby Scale (Norby, Hartz-Rubin & Verbrugge 2003). The analysis omits the germination year, to avoid effects that could be related to the precise timing of planting and germination.
Species were selected to include those that are dominant at one or both sites and to span the range of xylem anatomy (ring porous, diffuse porous and conifer) and successional statuses (Table 2). No experimental manipulation of temperature can accommodate enough species to test for the many ways in which different functional traits control phenology. Our large study of > 4000 surviving seeds and seedlings includes species representative of the different functional classes in Table 2, but recognizes them as representative only.
Analysis
The CDM was fitted jointly for all individuals of each species having sufficient sample size. Because of low germination, low survival, or both, not all species could be analysed. Seed source was included in the model only for species having more than 10 surviving individuals of both northern and southern seed sources (Table 2). The sample sizes in Table 2 refer to post-germination year individuals. A joint posterior distribution was simulated using Markov chain Monte Carlo (MCMC) (Appendices S3, S4). Development occurs on days above the threshold temperature. However, the threshold influences only when development can occur and not the magnitude of the effect; it makes no contribution to thermal efficiency. Thus, unlike degree-day models, the precise value used did not affect results.
Diagnostics involve model checking through in-sample prediction. Model selection is based on the Deviance Information Criterion (DIC). We evaluate the model based on its ability to predict the entire seasonal development of phenology for each individual, starting from the initial state of each individual (s = 1, h = 1) in spring.
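For readers unfamiliar with DIC, the quantity can be sketched from posterior draws as the mean deviance plus an effective number of parameters. The Python example below uses a toy normal likelihood with known variance and three made-up posterior draws of a mean; it is an assumption-laden illustration of the general definition and does not reproduce the CDM likelihood.

```python
import math
from statistics import mean


def deviance(mu, data, sigma=1.0):
    """Deviance, -2 x log-likelihood, for a normal model with known sigma."""
    loglik = sum(
        -0.5 * math.log(2 * math.pi * sigma ** 2)
        - (y - mu) ** 2 / (2 * sigma ** 2)
        for y in data
    )
    return -2.0 * loglik


def dic(draws, data):
    """Return (DIC, p_D) from posterior draws of the mean."""
    d_bar = mean(deviance(mu, data) for mu in draws)  # posterior mean deviance
    d_hat = deviance(mean(draws), data)               # deviance at posterior mean
    p_d = d_bar - d_hat                               # effective number of parameters
    return d_bar + p_d, p_d


# Toy data and hypothetical posterior draws (in practice, MCMC output)
data = [1.0, 2.0, 3.0]
draws = [1.5, 2.0, 2.5]
dic_value, p_d = dic(draws, data)
print(dic_value, p_d)
```

Lower DIC indicates a preferred model; the penalty p_D grows with the spread of the posterior, which is how additional terms such as seed source or interactions are charged for their added flexibility.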
Comparison with traditional degree-day models
To compare our results with a traditional degree-day model, we regressed budbreak date against growing degree-days and chilling units of the preceding winter. The literature includes many different regression models; we followed the approach of Cook, Wolkovich & Parmesan (2012) as a recent example, using the DD sum for the first 3 months of the year. Unlike their study, we did not standardize DD in the regression (we did not divide DD by its standard deviation), because standardization makes the coefficient depend on the range of values that happened to occur during a specific study and species. Because each species could experience a different range of temperatures, the scale for standardized coefficients is not comparable across species. Unstandardized coefficients share a common scale across species, in days per °C. Our use of two geographically separated sites and temperature treatments ensured a large range of DD as inputs.
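The effect of standardization on comparability can be seen in a minimal Python sketch. The two data sets are hypothetical: both are built with the same response of -0.1 days per degree-day and differ only in the range of DD that happens to be sampled.

```python
from statistics import mean, pstdev


def ols_slope(x, y):
    """Least-squares slope of y on a single predictor x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx


def standardized_slope(x, y):
    """Slope after dividing the predictor by its standard deviation."""
    sd = pstdev(x)
    return ols_slope([xi / sd for xi in x], y)


# Two hypothetical species with identical sensitivity but different DD ranges
dd_narrow = [100, 110, 120, 130, 140]
dd_wide = [50, 100, 150, 200, 250]
budbreak_narrow = [120 - 0.1 * d for d in dd_narrow]
budbreak_wide = [120 - 0.1 * d for d in dd_wide]

print(ols_slope(dd_narrow, budbreak_narrow), ols_slope(dd_wide, budbreak_wide))
print(standardized_slope(dd_narrow, budbreak_narrow),
      standardized_slope(dd_wide, budbreak_wide))
```

The unstandardized slopes agree, whereas the standardized slopes differ by a factor of five, which is simply the ratio of the two standard deviations of DD.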
Results
Environmental variation and experimental control
Field sites at Duke Forest and Harvard Forest maintained a 5–10 °C temperature difference, the difference being greatest in summer, both above- and below-ground (Fig. 3a,b). At the end of pretreatment year 2009, warmed chambers tracked ambient at +3 °C and +4 °C for air and +3 °C and +5 °C for soil. Air temperatures for Harvard return to ambient over winter when snow cover is present, and air is unheated (Fig. 3a). Winter temperatures varied substantially between years, with chilling greatest at Harvard Forest in 2010–11. We used the average of air and soil chilling units as inputs for the model.

Inference
Inference yields estimates of discrete and continuous states, s and h, observation errors in discrete states, effects of predictors on development in parameter vector β, and parameters that connect discrete and continuous states (Appendix S2). For clarity, we begin with the dynamic context, summarizing relationships over time and between observations, followed by estimates and model diagnostics.
Figure 4 shows the relationship between the latent developmental state h(t) for an individual tree and the observed discrete states Sk, k = 1,…,6. Development begins in state 1 (Fig. 4a). The continuous developmental state h is estimated with uncertainty (95% CIs are dashed lines in Fig. 4a), depending on observations S (dots).

Development is nonlinear in state and in time. The connection between the development state h and the discrete states s is shown in Fig. 4b and c. The scale in Fig. 4b expands and contracts in h space to accommodate the relationship between underlying development h and the discrete observations – for this individual stage 5 occupies a larger portion of the h scale than do stages 2–4. The differences between posterior probabilities (solid lines in Fig. 4b) and prior probabilities (dotted lines) are important, because they show that the prior has been updated by data. Development is nonlinear not only on the developmental scale (Fig. 4b), but also on the time scale (Fig. 4c).
Using DIC for model selection, we arrived at a preferred model for each species (Table 2). All models included an intercept and temperature. Selected models for all but Magnolia included chilling units. Seed source was selected for all species having multiple seed sources, with the exception of Nyssa. Interactions involving temperature, chilling and seed source were included in selected models for seven of the 15 species.
The model was evaluated by predicting the entire course of observations for all individuals starting from dormant state 1 through fully expanded state 6 (Fig. 5). This is a more rigorous standard than the one-step-ahead prediction often used to evaluate fitted time-series models. There is large variability, but most 68% (1 standard deviation of the mean) and all 95% predictive intervals spanned the 1:1 line. Nonetheless, models for some species tend to over-predict the first stage or under-predict the last stage.

Parameter estimates were well resolved in the model. Effects of temperature (Figs 6, 7a) and chilling units (Fig. 7a) differ from zero, with the exception of chilling units for Liquidambar. Both spring temperatures and chilling units have positive effects on (accelerate) development. The signs of temperature effects are reversed in Figs 6 and 7a, because parameters in Fig. 7a are transformed to permit comparison with degree-day regression models (Appendices S1–S5). Positive chilling parameters in Fig. 7a are consistent with a chilling requirement, that is, a positive relationship between winter chilling and spring development rate.


Comparison with degree-day models for warming and chilling
To permit comparison of our CDM with traditional degree-day models, we transformed parameters in the CDM in terms of effects on days rather than development rate (the two have an inverse relationship – Appendix S5). The predictive interval includes parameter error, model error and individual variation.
The contrast between warming and chilling effects in the CDM vs the degree-day model is dramatic (Fig. 7). The CDM shows low correlation in predictive densities (parameter ellipses do not have a tendency to orient in a particular direction in Fig. 7a) as expected when there is an adequate distribution of data. Parameter estimates are well resolved, with clear warming and chilling effects, including differences between species. By contrast, the regression approach shows large correlation within and between parameter estimates (Fig. 7b).
Hypothesized effects of geography
Within a species, northern populations (unshaded densities in Fig. 6a) respond less to temperature than southern populations (shaded). In northern climates, populations break bud after their southern counterparts, but still achieve budbreak on fewer degree-days. In other words, plants originating from northern populations appear capable of completing development in a short interval within which fewer degree-days accumulate. Perhaps for this reason, seeds from northern populations tended to respond less to warming. The limited response of northern populations to warming is counter to the hypothesis that northern populations should be more responsive.
Hypothesized species and functional type differences
Phenology of Nyssa sylvatica most closely matched the prediction of DD models, that the advance in budbreak would be predicted by DD (Fig. 8). This is shown as the probability for state 3 (budbreak) plotted against DOY in the centre panels and against degree-days at right of Fig. 8. These are predictive mean probabilities for an individual exposed to the temperatures for an ambient and an elevated chamber, shown at left in blue and orange, respectively. Warming advances budbreak date (centre) to a degree predicted by the DD, indicated by the overlapping curves at right. Pinus strobus is typical of the remaining species, which show a large advance in timing (lower centre) that is not well predicted by DD (lower right).

Taken across all species, the ranking of response to warming or chilling was not related to xylem anatomy (Table 2). The ring-porous Quercus alba, Q. rubra and Fraxinus tended to be among the more responsive to warming, particularly southern populations of Quercus (Fig. 6a), but intermediate in response to chilling (Fig. 6b). The conifers Pinus palustris, P. resinosa, P. strobus and P. taeda were intermediate in response to both.
Discussion
One of the largest effects of warming on net primary production is expected to come from prolonged growing seasons, but warming will be uneven throughout the year, and counteracting effects of winter chilling vs spring warming frustrate prediction. Traditional degree-day models that are increasingly used to evaluate the impacts of climate change can misrepresent impacts of this variation. Degree-day models aggregate fluctuating temperatures in ways that hide or distort responses. They are essentially uninfluenced by onset of warming in spring, but assume instead a temperature threshold acting as a chronic penalty on development. The continuous development model (CDM) recasts development as a continuous response to fluctuating temperature informed by observations of discrete responses at discrete time intervals. In doing so, we circumvent the problem of explaining a discrete event like budbreak on the basis of climate aggregated over months. Clear species and population differences emerge from a model that incorporates realistic assumptions concerning observations and underlying process.
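The aggregation problem can be illustrated with a minimal sketch. It assumes one common form of the degree-day sum (daily exceedance above a base temperature); the function name, the 5 °C base and both temperature series are hypothetical and are not taken from this study.

```python
def degree_days(daily_mean_temps, base_temp=5.0):
    """Sum of daily exceedances above base_temp (one common degree-day form).

    base_temp = 5.0 is an assumed threshold, chosen only for illustration.
    """
    return sum(max(t - base_temp, 0.0) for t in daily_mean_temps)


# Two hypothetical 10-day series (deg C); illustrative values, not data.
early_warm = [15.0] * 5 + [5.0] * 5   # warm spell first, then cool
late_warm = [5.0] * 5 + [15.0] * 5    # cool first, then warm spell

dd_early = degree_days(early_warm)
dd_late = degree_days(late_warm)

# Both series give the same aggregate even though the timing of warmth differs.
print(dd_early, dd_late)
```

Because the two series yield the same sum, a model that sees only the aggregate cannot distinguish an early warm spell from a late one, whatever the developmental state of the plant at the time.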
Why phenology models need to incorporate uncertainty
The effects of species, seed source, site, resources and weather variation over entire seasons combine to determine single events like flowering or budbreak. At a time when global change scientists increasingly appreciate the heterogeneity of climate change, evidence for climate effects on phenology is increasingly influenced by meta-analyses of published results, where aggregation often precludes process-level modelling and the treatment of uncertainty. Like survival analysis, phenology is especially prone to aggregation problems, because information is limited to time-to-event. Such analyses can be useful, provided the limitations are understood.
The discrete ordinal nature of data is likewise important. Increasing the number of states that are reported in data sets can benefit inference, for example six (this study), rather than two (see also Norby, Hartz-Rubin & Verbrugge 2003). However, the more states, the more important it becomes to allow for observation error, because it becomes increasingly difficult to assign classes consistently when there are more of them. As with any such classification, the six developmental states we recognize are not linearly related to temperature or to the underlying continuous developmental rate (Fig. 4). There is no reason to expect a simple relationship, because observed states are defined by the capacity to identify them confidently. The CDM finds the relationship between discrete observations in time and state that explains the changing development of large numbers of individuals. It benefits from large numbers of observations for temperature and response.
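The link between a continuous developmental state and discrete, error-prone observations can be sketched as follows. This is an illustration only: the cut points, the error rate and the rule that errors fall on adjacent classes are assumptions made for the example, not the specification of the CDM.

```python
import bisect

# Hypothetical cut points dividing a latent continuous state h into 6 classes.
CUTS = [1.0, 2.0, 3.0, 4.0, 5.0]


def true_state(h):
    """Ordinal state (1..6) implied by the latent continuous state h."""
    return bisect.bisect_right(CUTS, h) + 1


def observation_probs(h, p_error=0.1):
    """Probability that each state 1..6 is recorded, given latent h.

    Assumed error model: with probability p_error the observer records an
    adjacent class, split evenly among the adjacent classes that exist.
    """
    n_states = len(CUTS) + 1
    s = true_state(h)
    neighbours = [k for k in (s - 1, s + 1) if 1 <= k <= n_states]
    probs = {k: 0.0 for k in range(1, n_states + 1)}
    probs[s] = 1.0 - p_error
    for k in neighbours:
        probs[k] = p_error / len(neighbours)
    return probs


print(true_state(2.5), observation_probs(2.5))
```

Under these assumptions, unequal spacing of the cut points would make the recorded states a nonlinear function of the latent state, which is one reason a simple relationship between observed classes and temperature is not expected.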
Warming effects on phenology
The large differences between species in response to warming and chilling identified here (Figs 6 and 7) are not predicted by functional type differences hypothesized in the literature. Of course, even a large experimental study like this cannot include enough species of each functional type combination to fully evaluate effects of xylem anatomy and successional status. Our results should be taken as a first assessment where the full effects of continuously varying temperature are considered. However, it would not be correct to interpret lack of correlation with warming response to mean that traits like xylem anatomy do not matter. Xylem anatomy may indeed limit responses of ring-porous species, but they compensate in other ways. More experiments are needed to fully evaluate these effects. The importance of this study is not to question that functional relationships exist (they must); rather it questions the existence of simple correlations between traits and phenology.
Contrary to the hypothesis that northern populations should respond most, southern populations show the largest responses (Fig. 6). Chuine (2010) and Morin et al. (2010) suggest that northern range limits could be determined by inability to complete fruit maturation rather than killing frost, while southern limits result from the inability to break dormancy due to a lack of chilling temperatures. In the northern part of the range, populations might be prone to taking greater risk during warm springs. Our results showing the opposite trend may be explained by the fact that northern populations already develop later in calendar days, but earlier in degree-days. If populations in the north must develop rapidly, they may be unable to accelerate development further.
Offsetting effects of spring and fall warming
Cook, Wolkovich & Parmesan (2012) reported that vernalization sensitivity, which could result in a tendency for warming in fall/winter to delay spring phenology, was most common in the species that showed the least response to warming. The acceleration effects of a warm spring could be offset by the delay caused by reduced chilling in winter. If so, the species most sensitive to warming might be those showing no effect at all. In that study, a traditional degree-day calculation was applied to three-month windows of temperature data from fall through spring and compared with flowering times using regression. The analysis concentrates on two outcomes, a minority having a vernalization requirement, termed ‘divergent’, and a majority showing no such requirement. A vernalization response was assigned only to species where estimates had the correct sign, fewer than 20% of those tested. Thus, the non-responder category included the majority of species.
By applying both models to the same data, we demonstrate how the traditional degree-day model can be misleading. The expected responses are contained in the lower left (blue) quadrants in Fig. 7a,b, that is, those for which both chilling units and spring warming advance phenology. Given that chilling requirements are known to operate in most species (e.g. Campoy, Ruiz & Egea 2011; Cooke et al. 2012), capacity to identify them is a basic test of phenology models. As in the analysis of Cook, Wolkovich & Parmesan (2012), the degree-day model applied to our data fails this test by assigning to most species the wrong sign (Fig. 7b), suggesting the opposite of a chilling requirement. In addition, most spring warming effects are not different from zero (horizontal axis in Fig. 7b). Both results contradict known physiology.
But there is additional evidence in Fig. 7 that aids interpretation. The strong correlations in parameter estimates (many exceed 0·5 in Fig. 7b, as is evident in the 95% ellipses) and among species in Fig. 7b come from the correlations in winter and spring temperatures and, thus, in CU and DD, because warm winters tend to occur with warm springs. This is a standard feature of regression: parameter correlations are determined by the distribution of data. This correlation is unavoidable in the degree-day model, because temperature is aggregated over months. The loss of information and distortion that comes from aggregating an entire season of variation into a single number (Clark et al. 2011b, 2014a) is evident in the large number of non-significant warming effects in Fig. 7b (more than half of the warming effects straddle zero). Furthermore, the chilling effects mostly have the wrong sign: chilling is predicted to delay, not advance, budbreak. In the CDM, correlations between fall and spring temperatures break down, because development responds to daily variation. Phenology is accelerated by both fall chilling and spring warming (Fig. 7a). Across broad taxonomic coverage (broadleaf deciduous, broadleaf and needleleaf evergreen, ring- and diffuse-porous, shade tolerant and intolerant) parameters are well resolved and have low correlation. All show negative warming coefficients, and only Liquidambar shows phenological delays due to chilling.
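The statement that parameter correlations are determined by the distribution of data can be checked with a short sketch for ordinary least squares with two predictors. The covariance of the slope estimates is proportional to the inverse of the centred cross-product matrix, so the correlation between the two estimates is the negative of the correlation between the predictors. The chilling-unit and degree-day values below are hypothetical site-years invented for illustration, and the function names are not from this study.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def slope_estimate_correlation(x1, x2):
    """Correlation between the two OLS slope estimates in y ~ 1 + x1 + x2.

    Uses the inverse of the 2 x 2 centred cross-product matrix; the error
    variance cancels, so no response data are needed.
    """
    n = len(x1)
    m1, m2 = sum(x1) / n, sum(x2) / n
    a = sum((v - m1) ** 2 for v in x1)
    c = sum((v - m2) ** 2 for v in x2)
    b = sum((u - m1) * (v - m2) for u, v in zip(x1, x2))
    det = a * c - b * b
    var1, var2, cov12 = c / det, a / det, -b / det
    return cov12 / math.sqrt(var1 * var2)


# Hypothetical site-years: warm winters (few chilling units) paired with
# warm springs (many degree-days). Illustrative values only.
chill_units = [900, 850, 800, 700, 650, 600]
spring_dd = [200, 230, 240, 300, 310, 350]

r_predictors = pearson(chill_units, spring_dd)
r_estimates = slope_estimate_correlation(chill_units, spring_dd)
print(r_predictors, r_estimates)
```

In this sketch the predictors are strongly correlated, so the two slope estimates are almost perfectly correlated as well, regardless of the response values. The more nearly collinear the seasonal sums, the less the regression can separate a chilling effect from a warming effect.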
The distribution of data in regression explains the shapes of ellipses in the degree-day model, and the aggregation of temperatures explains how the order of species and signs of effects become distorted. Comparison of Fig. 7a and 7b further shows that the ordering of responses by the two methods differs substantially. If the degree-day method is detecting a positive correlation in responses to warming and chilling, it appears not to be the correct one, and it is mostly predicting that fall chilling delays budbreak, contrary to a dormancy-breaking effect. Still, the CDM suggests a positive relationship between warming and chilling across species. This trend must be treated with caution. Although the CDM does not suffer from the aggregation of temperatures for spring warming, it still aggregates chilling temperatures over a defined period (before 1 January). This was done because we cannot identify from data when chilling stops and warming begins. Even in the CDM, we cannot rule out the possibility that correlation structure in data affects estimates. From our analysis, it is clear there is a chilling requirement that differs between species, but the precise relationship to warming effects requires experimental decoupling of temperatures at different periods of time.
How to anticipate uneven warming
How important is the finding that traditional degree-day models can be misleading? We expect the largest problems to occur where the pattern of warming is uneven. If, as in some agricultural experiments, temperatures do not vary, and we simply increase temperatures by the same amount at all times, then poor predictive performance could be attributed solely to variation in developmental sensitivity over time. It is possible that aggregating temperatures into a single cumulative value may not introduce large errors in such cases. However, if warming is uneven, there is an interaction between developmental state and seasonality of warming. The interaction could produce surprises at least as extreme as those documented in the comparison in Fig. 7.
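The interaction between developmental state and the timing of warming can be illustrated with a toy calculation. The sketch below is not the CDM: it assumes a latent state h whose temperature sensitivity steps up once h passes a stage threshold, and all rates, thresholds and temperatures are hypothetical.

```python
def degree_days(temps, base=0.0):
    """Cumulative temperature above an assumed base of 0 deg C."""
    return sum(max(t - base, 0.0) for t in temps)


def final_development(temps, base=0.0):
    """Accumulate a latent developmental state h over a daily temperature series.

    Assumed form (hypothetical): sensitivity to temperature is 0.01 per degree
    until h reaches 0.48, and 0.03 per degree afterwards.
    """
    h = 0.0
    for t in temps:
        sensitivity = 0.01 if h < 0.48 else 0.03
        h += sensitivity * max(t - base, 0.0)
    return h


# Hypothetical 20-day series (deg C) with 4 degrees of warming applied
# either in the first or in the second half of the period.
ambient = [5.0] * 20
early_warming = [t + 4.0 for t in ambient[:10]] + ambient[10:]
late_warming = ambient[:10] + [t + 4.0 for t in ambient[10:]]

dd_early, dd_late = degree_days(early_warming), degree_days(late_warming)
h_early, h_late = final_development(early_warming), final_development(late_warming)
print(dd_early, dd_late, h_early, h_late)
```

The two warming scenarios have identical degree-day sums, yet they end at different developmental states because the added warmth falls on tissue of different sensitivity. Which scenario advances development more depends entirely on the assumed sensitivity schedule, so the sketch shows only that the aggregate cannot carry this information.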
Conclusions
By accommodating the effects of temperature variation, we show why traditional degree-day models provide limited insight. Development may not be closely linked to mean or cumulative temperature, because it responds to fluctuations in ways that depend on the developmental state. There is variation in response within and among years that does not survive the data aggregation from daily fluctuations to seasonal sums. Given that future warming will vary by season, it is important to move beyond the assumption that all temperatures have the same impact, regardless of developmental state.
Species attributes such as xylem anatomy and successional status can influence phenological response without resulting in trait correlations. Such correlations might be ‘significant’ in a sample of hundreds rather than 15 species. Our finding that they are not apparent in our modest sample of species indicates that knowledge of traits like xylem anatomy will likely not be a good predictor of phenology response over the range of variation in this study.
The fact that southern populations respond most to temperature could be viewed as evidence that residents will resist northward migration of immigrants as climate warms. If potential invaders fail to exploit a prolonged growing season to the same degree as residents, then there is a resident advantage. This possibility, combined with the observation that tree migration has lagged well behind climate change (Zhu et al. 2012), suggests further study of the role of phenology in tree migration.
Acknowledgements
The project was funded by the Department of Energy. For comments on the manuscript, we thank Matthew Kwit, Bradley Tomasek and three anonymous reviewers. For field assistance, we thank Lindsay Scott, Becky Roper, Maria Terres and Vicki Woltz. The authors declare no conflicts of interest.