Decomposing phylodiversity
Summary
- Measuring functional or phylogenetic diversity is the subject of an active literature. The main issues are relating measures to a clear conceptual framework, correcting the unavoidable estimation bias and decomposing diversity along spatial scales.
- We provide a general mathematical framework to decompose measures of species‐neutral, phylogenetic or functional diversity into α and β components. We first unify the definitions of phylogenetic and functional entropy and diversity as a generalization of HCDT entropy and Hill numbers when an ultrametric tree is considered. We then derive the decomposition of diversity. We propose a bias correction of the estimates allowing meaningful computation from real, often undersampled communities. Entropy can be transformed into true diversity, that is, an effective number of species or communities.
- Estimators of α‐ and β‐entropy, phylogenetic and functional entropy are provided.
- Proper definition and estimation of diversity is the first step towards better understanding its underlying ecological and evolutionary mechanisms.
Introduction
The species‐neutral approach to diversity measurement is based on Hill numbers, that is, the effective number of species (Jost 2006). It is now being complemented by richer conceptual frameworks that take species relatedness into account, that is, either their functional or their phylogenetic proximity. This is what has been called, in the first case, ‘functional diversity’ (Tilman et al. 1997) and, in the second one, ‘phylogenetic diversity’ or ‘phylodiversity’ (Webb, Losos & Agrawal 2006). When both relative abundance and degree of relatedness between species (or individuals) are quantified, Pielou (1975) suggested that diversity measures should be generalized, integrating taxonomic differences between species. A little later, Rao (1982) proposed that the average of the species differences can be used as a measure of biodiversity. Despite some attempts to incorporate taxonomic distinctness into a taxic diversity measure (Vane‐Wright, Humphries & Williams 1991), this ‘avant‐garde’ idea has hardly been applied in ecology (e.g. Warwick & Clarke 1995; Crozier 1997). During the last decade, increasing interest in the evolutionary history of communities (Webb 2000), as well as the need for conservation strategies taking phylogenetic risks into account (Faith 2008), has revived the interest in phylodiversity partitioning.
Phylogenetic trees are built upon the genetic similarities among individuals or higher taxa. In a given local assemblage, phylogenetic diversity aims to quantify the evolutionary history shared among individuals since the time of the most recent common ancestor (Faith 1992; Chao, Chiu & Jost 2010). All else being equal, an assemblage of phylogenetically divergent species is often seen as more diverse than a local assemblage of closely related species (Vellend et al. 2010). There is increasing interest in partitioning this phylogenetic diversity not only between local communities but also between time periods in order to elucidate community assembly rules (Pavoine, Love & Bonsall 2009) and investigate what is commonly called the phylogenetic structure of communities (e.g. Cavender‐Bares et al. 2004). For instance, Hardy & Senterre (2007) argued that a proper partitioning of phylodiversity is a necessary step prior to distinguishing phylogenetic clustering (due either to local speciation of allopatric clades or to habitat filtering of phylogenetically conserved traits) from phylogenetic overdispersion (allopatric speciation of two ancestral sympatric species, habitat filtering of phylogenetically convergent traits, competitive exclusion of related species).
Functional diversity was often defined as the extent of functional differences among individuals or species in a local community (Tilman 2001), an important determinant of ecosystem processes (Loreau et al. 2001). Functional diversity based on functional trees is a great tool to estimate the complementarity among individuals’ or species’ trait values by estimating their dispersion in trait space at all hierarchical scales simultaneously, avoiding discretization of continuous trait variation into functional groups (Petchey & Gaston 2002). Functional trees differ from phylogenetic trees as phylogenetic trees reflect evolutionary constraints whilst functional trees also take into account functional convergence (Hérault 2007). Each time a ‘proper’ functional tree can be constructed from a functional trait‐based distance matrix (Podani & Schmera 2007), it should be possible to estimate and partition functional diversity in a manner similar to phylogenetic diversity (Petchey & Gaston 2002). However, functional differences among species or individuals and, in fine, the functional diversity value itself will depend strongly on the a priori choice of important functional traits (Weiher et al. 1999).
In this paper, we consider that all individuals or species of a given local community are placed in an ultrametric phylogenetic or functional tree. The distance between two species is measured as the length of the branches between them and their first common node. Our methods apply regardless of the biological information used to build the tree and of how it is constructed, but phylogenetic diversity is the main target, as discussed below. We will write phylodiversity and phyloentropy for short when presenting the methods, and phylogenetic or functional diversity when we are more specific. The last two terms also name existing measures of diversity, PD (Faith 1992) and FD (Petchey & Gaston 2002). We will show that they are special cases of our measures (Table 1) and we will write PD and FD explicitly when considering them.
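| Measure | Diversity of order q | Special values of q |
|---|---|---|
| Phylogenetic or functional entropy/diversity | Entropy: phyloentropy of order q. Diversity: phylodiversity of order q. | Phylodiversity of order 0, multiplied by the tree height T, equals PD (Faith 1992). Phyloentropy of order 1, multiplied by T, equals Hp, the phylogenetic generalization of Shannon's index of Allen, Kon & Bar‐Yam. Phyloentropy of order 2, multiplied by T, equals Rao's quadratic entropy. |
| Species‐neutral diversity | Entropy: ${}^{q}H$. Diversity: ${}^{q}D$. | ${}^{0}H + 1$ is species richness. ${}^{1}H$ is Shannon entropy. ${}^{2}H$ is Simpson entropy. |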
Chao, Chiu & Jost (2010) generalized Hill numbers to measure phylodiversity. Pavoine, Love & Bonsall (2009) generalized HCDT entropy to measure phyloentropy (Shimatani 2001 and Ricotta 2005 had already done so, but for Rao's quadratic entropy only). We first show here their equivalence: phyloentropy is transformed into phylodiversity in the same way as HCDT entropy is transformed into diversity sensu stricto. Then, we derive phylodiversity partitioning as a straightforward generalization of that of HCDT diversity. We discuss the difference between our approach and that of Chiu, Jost & Chao (2014). Finally, we provide estimation‐bias corrections for phyloentropy in order to obtain bias‐corrected measures of phylodiversity.
Partitioning phylodiversity
Tsallis Entropy
Tsallis entropy, also known as HCDT entropy (Havrda & Charvát 1967; Daróczy 1970; Tsallis 1988), has proven to be a powerful tool to measure diversity, generalizing the classical indices of diversity, including the number of species, Shannon and Simpson indices (Jost 2006). The order of diversity q gives more or less importance to rare species. Entropy can be converted into diversity sensu stricto (Hill 1973; Jost 2006), which is easy to interpret and compare. Statistical estimators of diversity measures are intrinsically biased because of unseen species and also because they are not linear functions of probabilities (Marcon et al. 2014a). This is a serious issue (Dauby & Hardy 2012; Beck, Holloway & Schwanghart 2013), even if some bias corrections are available for HCDT entropy estimators (Grassberger 1988; Chao & Shen 2003; Marcon et al. 2014a).
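As a reminder of how the order q recovers the classical indices, HCDT entropy is commonly written as below. The notation is an assumption made for illustration: $p_s$ is the probability of species $s$ and $S$ the number of species.

```latex
\[
{}^{q}H = \frac{1-\sum_{s} p_s^{q}}{q-1},
\qquad
{}^{0}H = S-1,
\qquad
{}^{1}H = \lim_{q\to 1}{}^{q}H = -\sum_{s} p_s \ln p_s,
\qquad
{}^{2}H = 1-\sum_{s} p_s^{2}.
\]
```

Order 0 counts species regardless of their frequency, whereas order 2 is dominated by the most frequent species, which is the sense in which q gives more or less importance to rare species.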
Species‐Neutral Diversity
Community weights are $w_i$: they may be equal to $n_i/n$, but any positive values summing to 1 are allowed. Probabilities in the metacommunity depend on these weights: the probability $p_s$ of species s in the metacommunity is the weighted average of its probabilities $p_{s,i}$ in the local communities i, $p_s = \sum_i w_i p_{s,i}$.
Diversity of the metacommunity is γ‐diversity. Diversity of local communities is α‐diversity. The formalism of deformed logarithms is appropriate: it allows elegant and intuitive algebra. The logarithm of order q is defined as follows:
$$\ln_q x = \frac{x^{1-q}-1}{1-q}$$ (eqn 1)
(eqn 2)
(eqn 3)
(eqn 4)
Last, diversity is the deformed exponential of entropy, ${}^{q}D_\gamma = e_q^{{}^{q}H_\gamma}$, and entropy is the deformed logarithm of diversity: ${}^{q}H_\gamma = \ln_q {}^{q}D_\gamma$.
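The relations between deformed logarithm, entropy and diversity can be sketched numerically. This is a minimal illustration under the standard definitions of the deformed logarithm and exponential; the function names are ours and are not taken from the entropart package.

```python
import math

def ln_q(x, q):
    """Deformed logarithm of order q; tends to the natural log as q -> 1."""
    if abs(q - 1.0) < 1e-12:
        return math.log(x)
    return (x ** (1.0 - q) - 1.0) / (1.0 - q)

def exp_q(x, q):
    """Deformed exponential of order q, the inverse of ln_q."""
    if abs(q - 1.0) < 1e-12:
        return math.exp(x)
    return (1.0 + (1.0 - q) * x) ** (1.0 / (1.0 - q))

def hcdt_entropy(probs, q):
    """HCDT (Tsallis) entropy: sum of p * ln_q(1/p) over species with p > 0."""
    return sum(p * ln_q(1.0 / p, q) for p in probs if p > 0)

def diversity(probs, q):
    """Hill number: deformed exponential of HCDT entropy."""
    return exp_q(hcdt_entropy(probs, q), q)

if __name__ == "__main__":
    community = [0.5, 0.3, 0.2]  # hypothetical species probabilities
    for q in (0, 1, 2):
        print(q, hcdt_entropy(community, q), diversity(community, q))
```

With equally frequent species the diversity equals the number of species whatever q, which is what makes it an effective number of species.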
Phyloentropy and Phylodiversity
Consider a phylogenetic or functional ultrametric tree (Fig. 1) partitioned into depth intervals delimited by slices passing through the internal nodes. Following Chao, Chiu & Jost (2010), the first slice starts at the bottom of the tree and ends at the lowest node. In slice k, $L_k$ leaves are found. The probabilities of occurrence of the species belonging to the branches that were below leaf l in the original tree are summed to give the grouped probability $u_{k,l}$.

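Phyloentropy is the average of HCDT entropy over the slices, each slice k being weighted by its length $T_k$ relative to the height of the tree $T = \sum_k T_k$. We denote it as ${}^{q}\bar{H}(T)$:
$${}^{q}\bar{H}(T) = \sum_k \frac{T_k}{T}\,{}^{q}_{k}H$$ (eqn 5)
${}^{q}_{k}H$ is HCDT entropy in slice k. It is calculated as ${}^{q}_{k}H = -\sum_l u_{k,l}^{q} \ln_q u_{k,l}$.
(eqn 6)
Phylodiversity ${}^{q}\bar{D}(T)$ is the deformed exponential of phyloentropy:
$${}^{q}\bar{D}(T) = e_q^{{}^{q}\bar{H}(T)}$$ (eqn 7)
This relation is exactly the same as the relation between HCDT entropy and diversity. In other words, phyloentropy is the weighted average of entropy along the tree, and phylodiversity is the corresponding Hill number. Entropy is linear: it can be summed over slices. Diversity is not: phylodiversity is not the weighted average of diversity along the tree.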
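The slicing described above can be sketched on a toy tree. The tree, the species names and the representation of slices as (length, groups) pairs are assumptions made for illustration; this is not the entropart implementation.

```python
import math

def ln_q(x, q):
    """Deformed logarithm of order q (natural log when q = 1)."""
    return math.log(x) if q == 1 else (x ** (1 - q) - 1) / (1 - q)

def exp_q(x, q):
    """Deformed exponential of order q, the inverse of ln_q."""
    return math.exp(x) if q == 1 else (1 + (1 - q) * x) ** (1 / (1 - q))

def hcdt_entropy(probs, q):
    """HCDT entropy of a probability vector."""
    return sum(p * ln_q(1 / p, q) for p in probs if p > 0)

def phylo_entropy(slices, probs, q):
    """Average of slice entropies, weighted by slice length over tree height.

    slices: list of (length, groups); each group lists the species that
    share one branch in that slice, so their probabilities are summed.
    """
    height = sum(length for length, _ in slices)
    total = 0.0
    for length, groups in slices:
        grouped = [sum(probs[s] for s in group) for group in groups]
        total += length / height * hcdt_entropy(grouped, q)
    return total

def phylo_diversity(slices, probs, q):
    """Deformed exponential of phyloentropy."""
    return exp_q(phylo_entropy(slices, probs, q), q)

# Hypothetical tree: A and B join at height 1, all three join at height 3.
SLICES = [(1.0, [["A"], ["B"], ["C"]]), (2.0, [["A", "B"], ["C"]])]
PROBS = {"A": 0.25, "B": 0.25, "C": 0.5}

if __name__ == "__main__":
    for q in (0, 1, 2):
        print(q, phylo_entropy(SLICES, PROBS, q), phylo_diversity(SLICES, PROBS, q))
```

In this toy case, averaging the diversities of the two slices with the same weights gives a different value from the phylodiversity, which illustrates why entropy, not diversity, is summed along the tree.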
Decomposition
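In each slice k, α‐entropy is the weighted average of the entropy of the local communities:
$${}^{q}_{k}H_\alpha = \sum_i w_i\,{}^{q}_{k}H_\alpha^{i}$$ (eqn 8)
where ${}^{q}_{k}H_\alpha^{i}$ is the HCDT entropy of local community i in slice k.
(eqn 9)
β‐entropy in slice k is the difference between the γ‐entropy of the metacommunity and α‐entropy, and the contributions of local community i to α‐ and β‐entropy add up over communities according to the weights $w_i$. This can be summed over slices and rearranged to obtain the decomposition of γ‐phyloentropy:
$${}^{q}\bar{H}_\gamma(T) = {}^{q}\bar{H}_\alpha(T) + {}^{q}\bar{H}_\beta(T)$$ (eqn 10)
$${}^{q}\bar{D}_\gamma(T) = {}^{q}\bar{D}_\alpha(T) \times {}^{q}\bar{D}_\beta(T)$$ (eqn 11)
Here ${}^{q}\bar{H}(T)$ denotes phyloentropy and ${}^{q}\bar{D}(T)$ phylodiversity, with subscripts for their α, β and γ components.
Bias Correction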
α‐ and γ‐HCDT entropies can be corrected following Marcon et al. (2014a). When q is low, unobserved species are the main issue, which can be corrected according to Chao & Shen (2003). When q is high, the contribution of rare species to entropy is small, so the bias they cause is small, but entropy is less linear with respect to probabilities, requiring the correction of Grassberger (1988). The limit between low and high values of q is reached when both estimators are equal, empirically above q = 1 (Marcon et al. 2014a). Bias correction relies on the number of sampled individuals (probabilities are not enough) and can be computed for positive values of q. The bias‐corrected estimators are denoted ${}^{q}\tilde{H}$ instead of ${}^{q}\hat{H}$. Their formulas are in Marcon et al. (2014a) and are not repeated here.
The estimator of α‐phyloentropy, ${}^{q}\tilde{\bar{H}}_\alpha(T)$, relies on values of ${}^{q}_{k}\tilde{H}_\alpha^{i}$, the bias‐corrected estimators of HCDT α‐entropy in slice k in local community i.
$${}^{q}\tilde{\bar{H}}_\alpha(T) = \sum_k \frac{T_k}{T} \sum_i w_i\,{}^{q}_{k}\tilde{H}_\alpha^{i}$$ (eqn 12)
where $T_k$ is the length of slice k and T the height of the tree. Since the number of individuals in some leaves $u_{k,l}$ increases in slices close to the root of the tree, the bias decreases with k.
${}^{q}\tilde{\bar{H}}_\gamma(T)$ is calculated in the same way. β‐phyloentropy is obtained as the difference between ${}^{q}\tilde{\bar{H}}_\gamma(T)$ and ${}^{q}\tilde{\bar{H}}_\alpha(T)$ because Grassberger's correction is not available to allow direct calculation.
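To give an idea of what a correction for unobserved species looks like, here is a minimal sketch restricted to Shannon entropy (q = 1), assuming the usual form of the Chao & Shen (2003) estimator: observed frequencies are shrunk by the estimated sample coverage and each term is divided by the probability that the species is observed at all. The generalization to other orders and Grassberger's correction are in Marcon et al. (2014a) and are not reproduced here. The function names and the handling of an all‐singleton sample are our assumptions.

```python
import math

def shannon_plugin(counts):
    """Naive (plug-in) Shannon entropy from abundance counts."""
    n = sum(counts)
    return -sum((c / n) * math.log(c / n) for c in counts if c > 0)

def shannon_chao_shen(counts):
    """Coverage-adjusted Shannon entropy in the spirit of Chao & Shen (2003)."""
    counts = [c for c in counts if c > 0]
    n = sum(counts)
    singletons = sum(1 for c in counts if c == 1)
    if singletons == n:
        # Assumption: avoid a zero coverage when every species is a singleton.
        singletons = n - 1
    coverage = 1.0 - singletons / n
    total = 0.0
    for c in counts:
        p = coverage * c / n                # coverage-adjusted probability
        inclusion = 1.0 - (1.0 - p) ** n    # probability the species is sampled
        total -= p * math.log(p) / inclusion
    return total

if __name__ == "__main__":
    sample = [5, 3, 1, 1]  # hypothetical abundances in one leaf-level sample
    print(shannon_plugin(sample), shannon_chao_shen(sample))
```

In the phylogenetic setting, such an estimator would be applied to the grouped counts of each slice before summing over slices.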
Example
We used the tropical forest data set already investigated by Marcon et al. (2012, 2014a). Two 1‐ha plots were fully inventoried in the Paracou field station in French Guiana. A total of 1124 individual trees (diameter at breast height over 10 cm) were sampled, belonging to 229 species. The phylogenetic tree was built by introducing a rough taxonomy of the 229 species in the analysis: the distance between species of the same genus is set to 1, to 2 for different genera of the same family and to 3 for different families. The functional tree was based on species relatedness using four key functional traits, each of them related to one axis of the leaf‐height–seed–stem economic spectra of tropical trees (Baraloto et al. 2010b): seed mass and tree maximum height (Hérault et al. 2011) plus specific leaf area and wood specific gravity (Baraloto et al. 2010a). The functional tree was built from a Gower's similarity matrix agglomerated using Ward's method (full details in Hérault & Honnay 2007). Diversity was calculated with the entropart package (Marcon & Hérault 2014) in R (R Development Core Team 2014): bias‐corrected entropy was calculated first, summed and finally transformed into diversity. The necessary R code is in the supporting information, Appendix S1.
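The taxonomic distance rule used for the phylogenetic tree can be sketched as follows. The taxon names are hypothetical placeholders, not species of the Paracou data set, and the `Taxon` structure is ours.

```python
from typing import NamedTuple

class Taxon(NamedTuple):
    species: str
    genus: str
    family: str

def taxonomic_distance(a: Taxon, b: Taxon) -> int:
    """0 for the same species, 1 within a genus, 2 within a family, 3 otherwise."""
    if a == b:
        return 0
    if a.family != b.family:
        return 3
    if a.genus != b.genus:
        return 2
    return 1

# Hypothetical taxa for illustration only.
TAXA = [
    Taxon("Genus1 alpha", "Genus1", "FamilyX"),
    Taxon("Genus1 beta", "Genus1", "FamilyX"),
    Taxon("Genus2 gamma", "Genus2", "FamilyX"),
    Taxon("Genus3 delta", "Genus3", "FamilyY"),
]

if __name__ == "__main__":
    for a in TAXA:
        print([taxonomic_distance(a, b) for b in TAXA])
```

Because the distance only depends on the lowest shared taxonomic rank, the resulting matrix is ultrametric and corresponds to a tree with three slices.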
We first calculated the species‐neutral, phylogenetic and functional diversity of order 1 of the metacommunity (the two plots) and partitioned it (each plot is considered as a local community, weights are proportional to the numbers of individuals). The γ‐species‐neutral diversity (Hill number of Shannon entropy) is 134 effective species, partitioned into α‐diversity equal to 92 effective species (82 and 107 in each plot) and β‐diversity equal to 1·46 equivalent communities. Phylogenetic and functional diversity values, respectively, are:
γ‐diversity ${}^{1}\bar{D}_\gamma(T)$ = 55 and 5·9, α‐diversity ${}^{1}\bar{D}_\alpha(T)$ = 42 and 5·5, with β‐diversity ${}^{1}\bar{D}_\beta(T)$ = 1·29 and 1·06. Considering the taxonomy of Paracou species, γ‐phylodiversity is around 2·5 times smaller than species‐neutral diversity. Functional diversity is only six equivalent species, showing an extreme redundancy according to the functional tree: FD (Petchey & Gaston 2002), that is functional diversity of order 0, is estimated equal to 18 whilst the estimated number of species is 297.
Since γ‐diversity is the product of α by β, they can be represented as nested rectangles (Fig. 2). The rectangle of size ${}^{q}\bar{D}_\alpha(T)$ (α‐diversity) by ${}^{q}\bar{D}_\beta(T)$ (β‐diversity) has the same area as that of size ${}^{q}\bar{D}_\gamma(T)$ (γ‐diversity) by 1. Plotting species‐neutral and phylodiversity together summarizes the essential information: the reduction of diversity due to the consideration of species phylogenetic or functional proximity.
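The equality of areas can be checked against the rounded values reported in the example. The sketch below reads the reported figures as γ, α and β in that order, which is our reading of the text; small gaps are expected because the published values are rounded.

```python
# Rounded values of order 1 reported in the example (effective numbers).
REPORTED = {
    "species-neutral": {"alpha": 92.0, "beta": 1.46, "gamma": 134.0},
    "phylogenetic": {"alpha": 42.0, "beta": 1.29, "gamma": 55.0},
    "functional": {"alpha": 5.5, "beta": 1.06, "gamma": 5.9},
}

def relative_gap(values):
    """Relative difference between alpha * beta and gamma."""
    product = values["alpha"] * values["beta"]
    return abs(product - values["gamma"]) / values["gamma"]

if __name__ == "__main__":
    for name, values in REPORTED.items():
        print(name, round(values["alpha"] * values["beta"], 2), values["gamma"])
```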

Profiles (Fig. 3) can be drawn for species‐neutral, phylogenetic and functional diversities.

Discussion
Unification of Measures of Diversity
Phyloentropy generalizes many previous indices of diversity. Rao's (1982) quadratic entropy is phyloentropy of order 2 multiplied by T, the tree height. It has been explored in depth and several results obtained here were already known in this special case. It was partitioned early on by Rao himself, who weighted communities according to their number of individuals, as did Villeger & Mouillot (2008), whilst Hardy & Senterre (2007) and Pavoine et al. (2013) used equal weights. Hardy & Jost (2008) validated both weightings but a general framework allowing the additive partitioning of Rao's entropy was missing (Guiasu & Guiasu 2011). We showed that arbitrary weights are acceptable.
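The link with Rao's quadratic entropy can be verified numerically on a toy ultrametric tree. The sketch assumes that the distance between two species is the height of their first common node, as defined in the Introduction; the tree and the probabilities are hypothetical.

```python
from itertools import product

# Hypothetical ultrametric tree: A and B join at height 1, all join at height 3.
PROBS = {"A": 0.25, "B": 0.25, "C": 0.5}
# Distance between two species = height of their first common node.
NODE_HEIGHT = {frozenset("AB"): 1.0, frozenset("AC"): 3.0, frozenset("BC"): 3.0}
# Slices as (length, groups of species sharing a branch in the slice).
SLICES = [(1.0, [["A"], ["B"], ["C"]]), (2.0, [["A", "B"], ["C"]])]

def rao_quadratic_entropy(probs, node_height):
    """Expected distance between two individuals drawn at random."""
    total = 0.0
    for a, b in product(probs, repeat=2):
        if a != b:
            total += probs[a] * probs[b] * node_height[frozenset((a, b))]
    return total

def phylo_entropy_order2(slices, probs):
    """Average over slices of Simpson entropy, weighted by slice length."""
    height = sum(length for length, _ in slices)
    total = 0.0
    for length, groups in slices:
        simpson = 1.0 - sum(sum(probs[s] for s in g) ** 2 for g in groups)
        total += length / height * simpson
    return total

TREE_HEIGHT = sum(length for length, _ in SLICES)

if __name__ == "__main__":
    print(rao_quadratic_entropy(PROBS, NODE_HEIGHT))
    print(TREE_HEIGHT * phylo_entropy_order2(SLICES, PROBS))
```

Both lines print the same value because two species contribute to Simpson entropy in exactly those slices that lie below their first common node.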
Other indices of diversity can be considered as special cases of phyloentropy (Table 1).
Alternative Partitioning
Chiu, Jost & Chao (2014) propose a different partitioning of phylodiversity (Chao, Chiu & Jost 2010) focusing on the independence between α and β components, following Jost (2007). It requires a particular definition of α‐diversity (in Chiu, Jost & Chao 2014; equation 6 for neutral diversity and 8 for phylodiversity), whilst we adopt Routledge's (1979) definition: α‐entropy is the weighted average entropy of communities, see equation 8. Chiu et al.'s approach is completely different from ours, as a simple example shows. Consider N communities, each containing a single species, with no species shared between communities. Whatever q, the entropy of each community is 0 and its diversity is 1 effective species. In our framework, α‐entropy equals 0 and α‐diversity is 1. More generally, whatever the weights and whatever q, if all communities have the same diversity, α‐diversity equals it.
In Chiu et al.'s framework, β‐diversity must be N since no species are shared, so α‐diversity is γ‐diversity divided by N. Species‐neutral α‐diversity of our example is not 1 but $\frac{1}{N}\left(\sum_i w_i^{q}\right)^{1/(1-q)}$, where $w_i$ is the weight of community i. Community weights and species frequencies play a similar role: low‐weighted communities, like rare species, have a lower influence when q increases, and conversely, α‐diversity is driven by rare species of low‐weighted communities when q decreases. We consider in this paper that community weights are arbitrary, such as sampling unit sizes, so Chiu et al.'s α‐diversity is not suitable here.
We believe that Routledge's definition of α‐diversity is more appropriate. Entropy is the average information in each community so it can meaningfully be averaged between communities according to their weight to define α‐entropy. Adding an infinitesimal community (with weight close to 0) does not change the metacommunity's diversity, whilst it changes discontinuously in Chiu et al.'s framework (β‐diversity jumps from N to N + 1, for example).
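The example of N single‐species communities can be worked through numerically. The sketch below implements the species‐neutral partitioning with Routledge's α‐entropy (weighted average of community entropies) and, for comparison, divides γ‐diversity by N as the text describes for Chiu et al.'s framework in this non‐overlapping case. The function names, species labels and weights are hypothetical.

```python
import math

def ln_q(x, q):
    """Deformed logarithm of order q (natural log when q = 1)."""
    return math.log(x) if q == 1 else (x ** (1 - q) - 1) / (1 - q)

def exp_q(x, q):
    """Deformed exponential of order q, the inverse of ln_q."""
    return math.exp(x) if q == 1 else (1 + (1 - q) * x) ** (1 / (1 - q))

def hcdt_entropy(probs, q):
    """HCDT entropy of a probability vector."""
    return sum(p * ln_q(1 / p, q) for p in probs if p > 0)

def partition(communities, weights, q):
    """Return (alpha, beta, gamma) diversities.

    communities: list of dicts mapping species to probabilities.
    alpha-entropy is the weighted average of community entropies.
    """
    alpha_entropy = sum(
        w * hcdt_entropy(list(c.values()), q) for c, w in zip(communities, weights)
    )
    meta = {}
    for c, w in zip(communities, weights):
        for species, p in c.items():
            meta[species] = meta.get(species, 0.0) + w * p
    gamma = exp_q(hcdt_entropy(list(meta.values()), q), q)
    alpha = exp_q(alpha_entropy, q)
    return alpha, gamma / alpha, gamma

if __name__ == "__main__":
    communities = [{"sp1": 1.0}, {"sp2": 1.0}, {"sp3": 1.0}]
    for weights in ([1 / 3, 1 / 3, 1 / 3], [0.5, 0.3, 0.2]):
        alpha, beta, gamma = partition(communities, weights, q=2)
        # gamma / N is alpha-diversity when beta is forced to equal N.
        print(weights, alpha, beta, gamma, gamma / len(communities))
```

With equal weights both approaches agree (α = 1, β = N). With unequal weights, Routledge's α‐diversity stays equal to 1 and β‐diversity falls below N, whereas forcing β = N yields an α‐diversity below 1.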
The price to pay is that α‐ and β‐diversities are not independent, as discussed more thoroughly in Marcon et al. (2014a). The real consequences of this dependence will have to be studied in depth.
Non‐Ultrametric Distances Between Species
Our framework relies on ultrametric trees, since entropy must be calculated slice by slice. Phylogenetic data are usually organized as a tree, but not necessarily an ultrametric one. Chao, Chiu & Jost (2010) calculate phylodiversity ${}^{q}\bar{D}(T)$ as a sum over the branches rather than over slices of the tree, allowing them to address non‐ultrametric trees. Although it is defined mathematically, such a value of phylodiversity faces several issues. Pavoine & Bonsall (2009) discuss its inconsistency in the special case of q = 2, for example, the fact that the species distribution maximizing diversity is then not unique. Leinster & Cobbold (2012) show that the distance between species used to calculate ${}^{q}\bar{D}(T)$ depends on species frequencies, questioning the very sense of what is measured. For these two reasons, we conclude that non‐ultrametric trees are not appropriate to measure phylodiversity in our framework, not only for technical reasons (only ultrametric trees can be sliced to allow estimation‐bias correction) but also for conceptual ones.
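Whether a distance matrix is compatible with an ultrametric tree can be tested with the three‐point condition: for any three species, the two largest of the three pairwise distances must be equal, that is d(i, j) ≤ max(d(i, k), d(k, j)). A minimal sketch, with hypothetical matrices:

```python
from itertools import permutations

def is_ultrametric(dist, tol=1e-9):
    """Check the three-point condition on a symmetric distance matrix.

    dist: square list of lists with a zero diagonal.
    """
    n = len(dist)
    for i, j, k in permutations(range(n), 3):
        if dist[i][j] > max(dist[i][k], dist[k][j]) + tol:
            return False
    return True

if __name__ == "__main__":
    taxonomic = [[0, 1, 3], [1, 0, 3], [3, 3, 0]]   # genus/family style distances
    arbitrary = [[0, 1, 3], [1, 0, 2], [3, 2, 0]]   # violates the condition
    print(is_ultrametric(taxonomic), is_ultrametric(arbitrary))
```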
Functional diversity is more frequently calculated from a non‐ultrametric matrix of distances between species, whose transformation into a dendrogram causes deformations (Pavoine, Ollier & Dufour 2005). The choice of the clustering method influences the shape of the tree and may lead to inconsistent results (Podani & Schmera 2006), although appropriate methods, applied to the example above, reduce these issues (Podani & Schmera 2007). A more appropriate way to address functional diversity is probably to use the distance matrix between species directly, or its transformation into a similarity matrix. Similarity‐based diversity (Leinster & Cobbold 2012) may be preferred to evaluate functional diversity. We derive its decomposition and propose reduced‐bias estimators elsewhere (Marcon, Zhang & Hérault 2014b).
Conclusion
In this paper, we provide a general, consistent and operational framework to decompose measures of species‐neutral, phylogenetic or even functional diversity into α (within local communities) and β (between local communities) components. We show that entropy can be calculated and its estimation bias corrected in each slice of the phylogenetic or functional tree, summed over slices and finally transformed into diversity. In fact, phylodiversity can be analysed without using any species concept (i.e. diversity of individuals without categorizing them into a set of species) provided that the phylogenetic or functional distance between individuals can be assessed, for example using molecular data or functional traits measured for each individual member of a metacommunity (Paine et al. 2011). Being able to properly partition phylodiversity is a necessary step towards deciphering the ecological and evolutionary mechanisms that underlie the structure and assembly of communities. Moreover, diversity partitioning will improve our assessment of human‐driven modifications of ecosystem functioning in conservation studies.
Acknowledgements
This work has benefited from an ‘Investissement d'Avenir’ grant managed by Agence Nationale de la Recherche (CEBA, ref. ANR‐10‐LABX‐0025). We thank Dr David Nipperess and an anonymous referee for their helpful comments and suggestions.
Data accessibility
The R scripts used to work the examples are available in the online supplement of the paper. They rely on the entropart package (Marcon & Hérault 2014) for R, which contains the data.
References