Unifying fossils and phylogenies for comparative analyses of diversification and trait evolution
Summary
- The aim of macroevolutionary research is to understand pattern and process in phenotypic evolution and lineage diversification at and above the species level. Historically, this kind of research has been tackled separately by palaeontologists, using the fossil record, and by evolutionary biologists, using phylogenetic comparative methods.
- Although both approaches have strengths, researchers gain most power to understand macroevolution when data from living and fossil species are analysed together in a phylogenetic framework. This merger sets up a series of challenges – for many fossil clades, well-resolved phylogenies based on morphological data are not available, while placing fossils into phylogenies of extant taxa and determining their branching times is equally challenging. Once methods for building such trees are available, modelling phenotypic and lineage diversification using combined data presents its own set of challenges.
- The five papers in this Special Feature tackle a disparate range of topics in macroevolutionary research, from time calibration of trees to modelling phenotypic evolution. All are united, however, in implementing novel phylogenetic approaches to understand macroevolutionary pattern and process in or using the fossil record. This Special Feature highlights the benefits that may be reaped by integrating data from living and extinct species and, we hope, will spur further integrative work by empiricists and theoreticians from both sides of the macroevolutionary divide.
Introduction
Macroevolution is evolutionary change occurring at or above the species level (Stanley 1979). As implied by this broad definition, the study of macroevolution encompasses a range of evolutionary processes, including phenotypic change through time in a single lineage, speciation and extinction patterns in clades, and modes of phenotypic evolution during adaptive radiations. For many years, studies of macroevolution have lived in two distinct realms. Palaeontologists have used direct evidence from fossils to uncover long-term patterns in trait evolution and species diversification over geologic time-scales. At the same time, neontologists have used phylogenetic trees and statistical comparative methods to ask similar questions about the tempo and mode of trait evolution and diversification through time. Although there has always been some cross-talk between these two subfields (discussed below), the methodologies and some of the core questions addressed by palaeontologists and neontologists often differ. These differences have impeded progress in understanding the pattern and process of evolution over very long time-scales.
A few studies have successfully bridged the gap between macroevolutionary studies that use fossils and those that use phylogenetic trees. One approach is to apply statistical comparative methods to data that include fossil taxa. This approach has a long history (e.g. Gingerich 1983, 1993; Cheetham 1986, 1987; Alroy 1998, 1999; Hunt 2006), but can be difficult, especially since most modern comparative methods require phylogenetic trees with branch lengths and good sampling at the species level. Another approach is to include fossil information in comparative analyses across phylogenetic trees of living species (Finarelli & Flynn 2006; Albert et al. 2009; Pyron & Burbrink 2012; Slater et al. 2012). Both of these approaches have great potential to add to our understanding of macroevolution in a way that spans both living and extinct taxa.
In this Special Feature, we have gathered a set of papers that seek to continue the merger of phylogenetic comparative methods and palaeontology. These papers are drawn primarily from palaeontologists and comprise a mixture of methodological and empirical studies. All are united by a common theme, however: harnessing the power that comes from using phylogenetic approaches together with fossils to understand macroevolution.
Time-scaling phylogenetic trees
A time-calibrated tree underpins most modern phylogenetic comparative methods. Whether inferring diversification rate or mode of phenotypic evolution, we require some knowledge of the branching times and patterns of shared ancestry among taxa in our clade of interest. Great advances have been made over the past decade in methods for time-scaling phylogenetic trees. Typically, these approaches rely on molecular data for topology and branch length inference, with fossil taxa acting only as ‘calibration points’ for prior distributions on node ages. As such, most quantitative time calibration exercises have historically been limited to extant taxa. More recently, Pyron (2011) and Ronquist et al. (2012) have described ways of integrating fossil taxa as terminal nodes in these kinds of analyses using discrete cladistic data, and Felsenstein (2002) has suggested a similar approach for continuous characters. Such approaches have the potential to greatly improve access to comparative methods for palaeontologists, but at least two significant issues remain. First, when time calibrating a phylogeny that includes fossil taxa, taxon sampling reflects not only the macroevolutionary processes operating within the clade, but also sampling rates of fossils, which themselves may vary in space and time. Information on sampling rates is rarely integrated in macroevolutionary analyses, even though accommodating variation in them could have a large influence on model parameters, including divergence time estimates derived from simultaneous analysis of fossil and extant taxa. In this issue, Wagner & Marcot (2013) test the fit of probabilistic models of sampling rate distributions to occurrence data and show that allowing for distributed, rather than uniform, rates and for differently shaped distributions for taxa in different geographical regions does a superior job of explaining fossil finds.
These results have important implications for a number of macroevolutionary questions, and Wagner & Marcot (2013) provide an enlightening demonstration by assessing divergence time estimates for Eocene-Oligocene carnivoramorphan mammals jointly using morphological, biogeographical and stratigraphic data. Approaches such as this will undoubtedly become more and more important as researchers seek to integrate fossil taxa into time-calibrated phylogenies.
A second issue is that many phylogenies used in comparative analyses, particularly in palaeontological studies, are composite topologies or supertrees. These phylogenies are not based on primary data that can be used to derive empirical branch lengths. A number of methods have been proposed in the palaeontological literature to deal with such scenarios (Norell 1992; Smith 1994; Friedman & Brazeau 2011), but many have undesirable properties for macroevolutionary studies, such as a tendency to produce phylogenetic trees with zero-length branches (i.e. polytomies) or a lack of ways for accommodating uncertainty in node age estimates. Bapst (2013) describes a new approach for time-scaling palaeontological phylogenies that is implemented in his paleotree package (Bapst 2012). Named the ‘cal-3’ approach for its requirement of estimates for three rates (speciation, extinction and sampling), this time-scaling algorithm allows the user to generate distributions of time-calibrated trees over which macroevolutionary models can be fitted, and potentially allows for ancestral relationships rather than strict bifurcation. Bapst's (2013) approach allows much greater flexibility and more rigorous assessment of divergence times in palaeontological supertrees than existing methods can achieve and will hopefully lead to more robust macroevolutionary studies across a wider range of fossil clades.
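The zero-length branch problem mentioned above can be illustrated with a small sketch. It uses a deliberately naive dating rule, assumed here purely for illustration, in which each node is assigned the age of the oldest first appearance among its descendants. It is not the cal-3 algorithm and does not use the paleotree package; the toy tree, the taxon names and the first appearance dates are all hypothetical.

```python
# Hypothetical toy tree: internal nodes are tuples, tips are taxon names.
tree = (("A", "B"), ("C", ("D", "E")))

# Hypothetical first appearance dates (millions of years before present).
fad = {"A": 60.0, "B": 55.0, "C": 60.0, "D": 40.0, "E": 35.0}


def tips(node):
    """Return the tip names descending from a node."""
    if isinstance(node, str):
        return [node]
    return [name for child in node for name in tips(child)]


def node_age(node, fad):
    """Naive rule: a node is as old as its oldest descendant's first appearance."""
    return max(fad[name] for name in tips(node))


def branch_lengths(node, fad, parent_age=None, out=None):
    """Collect (label, length) for every branch below the root."""
    if out is None:
        out = []
    age = node_age(node, fad)
    if parent_age is not None:
        out.append(("+".join(sorted(tips(node))), parent_age - age))
    if not isinstance(node, str):
        for child in node:
            branch_lengths(child, fad, age, out)
    return out


lengths = dict(branch_lengths(tree, fad))
zero_length = sorted(label for label, value in lengths.items() if value == 0)
print(lengths)
print("zero-length branches:", zero_length)
```

In this toy case, five of the eight branches receive zero length, because every node collapses onto the age of its oldest descendant. Most comparative models cannot use such branches, which is the kind of behaviour that motivates approaches that draw node ages probabilistically instead.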
Rates and modes of phenotypic change
Palaeontologists have provided evolutionary biology with an abundance of theories about tempo and mode in phenotypic evolution, from Simpson's notions of adaptive radiation via quantum evolution (Simpson 1944, 1953) through Eldredge and Gould's criticisms of gradualism and evocation of punctuated equilibria to explain the rapid appearance of new phenotypes in the fossil record (Eldredge 1971; Eldredge & Gould 1972). Many of these conceptual models have been formalized by phylogenetic comparative biologists working on extant clades and used in combination with phylogenetic data sets to test against null models of gradual, rate-homogeneous evolutionary processes. While the models used in comparative biology are elegant, their appropriateness for data sets comprising extant taxa only is sometimes questionable (Slater et al. 2012). Furthermore, there is a tendency among comparative biologists to assume that many major evolutionary patterns, such as explosions in morphological disparity after mass extinctions, can be explained by modelling shifts in the underlying rate of phenotypic evolution (O'Meara et al. 2006; Thomas et al. 2006; Eastman et al. 2011; Venditti et al. 2011). Many palaeontologists might instead argue that better explanations for many of these phenomena involve shifts in the underlying evolutionary process, such as a shift from constrained evolution to unbounded evolution. This distinction is not trivial, as the effects of these two alternatives on realized disparity are quite different (Hunt 2012).
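The contrast between a shift in rate and a shift in process can be made concrete with a standard pair of results, offered here as an illustration rather than as the formulation used in the works cited. Assume that unbounded evolution is modelled as Brownian motion (BM) with rate \sigma^2, and that constrained evolution is modelled as an Ornstein-Uhlenbeck (OU) process with the same rate and a constraint strength \alpha > 0. Both modelling choices, and all symbols, are assumptions introduced for this sketch. For independent lineages that start from the same ancestral value, the expected among-lineage variance after time t is:

```latex
\begin{align}
\operatorname{Var}_{\mathrm{BM}}(t) &= \sigma^{2}\, t, \\
\operatorname{Var}_{\mathrm{OU}}(t) &= \frac{\sigma^{2}}{2\alpha}\left(1 - e^{-2\alpha t}\right)
  \;\xrightarrow[t \to \infty]{}\; \frac{\sigma^{2}}{2\alpha}.
\end{align}
```

Under these assumptions, increasing \sigma^2 alone only steepens or rescales the curve, whereas releasing the constraint (\alpha \to 0) replaces a plateau in variance with unbounded linear growth. The two kinds of shift therefore predict different trajectories of disparity through time.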
Two papers in this issue deal specifically with questions relating to quantitative trait evolution. In the first, Hunt (2013) expands on approaches derived in the phylogenetic comparative methods literature to test for relative contributions of anagenetic and cladogenetic change (e.g. Bokma 2008). Hunt's results highlight the difficulties of decomposing evolutionary change into anagenetic and cladogenetic components, even when data from fossil taxa are available. Intriguingly, Hunt also finds that the amount of phenotypic change apportioned to cladogenetic events varies depending on whether within-lineage evolution is modelled as Brownian motion or stasis. Importantly, stasis cannot be modelled without data from fossils. Punctuated equilibrium remains a controversial hypothesis, but Hunt's results suggest that without integrating palaeontological data into macroevolutionary modelling, we may never understand whether its expectations are met in real data.
The question of what kind of model provides the best test for an evolutionary scenario is also raised in the second trait evolution paper. Slater (2013) fits a series of novel evolutionary models to a comparative data set for living and fossil mammals to test for shifts in the mode of body size evolution after the extinction of nonavian dinosaurs at the Cretaceous–Palaeogene boundary. Similar to Hunt's conclusions, Slater suggests that previous phylogenetic tests of this hypothesis used models with assumptions that did not adequately reflect the hypothesis being tested. These two contributions provide compelling empirical examples of macroevolutionary hypotheses that can be tested with a decent phylogenetic/palaeontological data set. More significantly though, they highlight the importance of carefully considering the expected outcomes of an evolutionary process and providing a suitable macroevolutionary test of those expectations.
Speciation and diversification
Palaeontologists have a long tradition of studying diversity dynamics and the speciation and extinction rates accompanying them (Raup et al. 1973; Sepkoski et al. 1981; Raup & Sepkoski 1982, 1984; Alroy et al. 2001). Phylogenetic comparative biologists have increasingly become interested in similar questions, and a range of methods now exist to test for constant or time-varying diversification rates using time-calibrated molecular phylogenies (reviewed in Stadler 2013). Expanding these approaches to include palaeontological data sets is slightly more challenging, but is also an active area of research (Stadler 2010; Didier et al. 2012). It is clear that this effort should be rewarding: diversification dynamics inferred from molecular phylogenies can sometimes directly conflict with the fossil record (Quental & Marshall 2010) and it is straightforward to show how such discrepancies might arise (Liow et al. 2010). In this issue, Ezard et al. (2013) tackle diversification dynamics from a slightly different angle, namely the relationship between rates of molecular evolution and number of speciation events along a lineage's evolutionary history. A positive correlation between rates of molecular evolution and clade diversity has been postulated before (e.g. Webster et al. 2003; Pagel et al. 2006; Venditti et al. 2006), but the idea remains controversial (e.g. Lanfear et al. 2010). Alternatively, increased speciation rates and rates of molecular evolution could both be driven by changes in life-history traits, such as body size or gestation time. Until now, the association between speciation events and the rate of molecular evolution has only ever been made on the basis of incomplete node counts (i.e. those derived from extant taxa in a molecular phylogeny). Ezard et al. (2013) take advantage of the rich fossil record and complete phylogeny of macroperforate planktonic forams to test this hypothesis using a complete fossil node count. Ezard et al. (2013) convincingly demonstrate that this question can be investigated most efficiently using palaeontological data.
Conclusions
In his preface to The Major Features of Evolution, G. G. Simpson declared himself neither a palaeontologist nor neontologist, but a practitioner of ‘the science of four dimensional biology, or of time and life’ (Simpson 1953, page xii). One cannot have a complete view of macroevolution without considering both the direct evidence of fossils and the detailed view of relationships and divergence times given by the tree of life. It is clear that students of macroevolutionary pattern and process can only benefit from a complete integration of palaeontological and neontological data and methods. We hope that this set of papers helps to further spur the merger of these two fields.
Acknowledgements
We would like to thank Editor-in-Chief Rob Freckleton, Assistant Editor Samantha Ponton and Journal Co-ordinator Graziella Iossa for allowing us to compile this Special Feature and for their help and guidance along the way. We also thank the authors who contributed papers to this issue and the reviewers who provided insightful comments on and critiques of their manuscripts.