Effects of phylogenetic reconstruction method on the robustness of species delimitation using single-locus data

Coalescent-based species delimitation methods combine population genetic and phylogenetic theory to provide an objective means for delineating evolutionarily significant units of diversity. The generalised mixed Yule coalescent (GMYC) and the Poisson tree process (PTP) are methods that use ultrametric (GMYC or PTP) or non-ultrametric (PTP) gene trees as input, intended for use mostly with single-locus data such as DNA barcodes. Here, we assess how robust the GMYC and PTP are to different phylogenetic reconstruction and branch smoothing methods. We reconstruct over 400 ultrametric trees using up to 30 different combinations of phylogenetic and smoothing methods and perform over 2000 separate species delimitation analyses across 16 empirical data sets. We then assess how variable diversity estimates are, in terms of richness and identity, with respect to species delimitation, phylogenetic and smoothing methods. The PTP method generally generates diversity estimates that are more robust to different phylogenetic methods. The GMYC is more sensitive, but provides consistent estimates for BEAST trees. The lower consistency of GMYC estimates is likely a result of differences among gene trees introduced by the smoothing step. Unresolved nodes (real anomalies or methodological artefacts) affect both GMYC and PTP estimates, but have a greater effect on GMYC estimates. Branch smoothing is a difficult step and perhaps an underappreciated source of bias that may be widespread among studies of diversity and diversification. Nevertheless, careful choice of phylogenetic method does produce equivalent PTP and GMYC diversity estimates. We recommend simultaneous use of the PTP model with any model-based gene tree (e.g. RAxML) and GMYC approaches with BEAST trees for obtaining species hypotheses.


Introduction
Species are a fundamental unit for many fields of biology, yet their identification and delimitation are rarely straightforward (Hebert et al. 2003). Molecular techniques allow for rapid and broad assessment of diversity of poorly known groups or where traditional techniques are difficult (Blaxter 2003;Tang et al. 2012;Fontaneto 2014). Well-established metrics for species delimitation exist (see Sites & Marshall 2003;Birky et al. 2005;Flot, Couloux & Tillier 2010;Puillandre et al. 2012a) but only a few are grounded in evolutionary theory and do not require a priori hypotheses regarding species entities (O'Meara 2010; Yang & Rannala 2010). Fewer still are designed for large-scale single-locus marker surveys (Fujisawa & Barraclough 2013;Zhang et al. 2013). With the increased frequency of DNA taxonomy studies and their potential marriage with next generation sequencing technologies (NGS -Creer et al. 2010), there is a need to determine potential sources of bias on diversity estimates. Here, we evaluate robustness of the generalised mixed Yule coalescent model (GMYC) and the Poisson tree process (PTP) species delimitation methods to different approaches of phylogenetic reconstruction of the gene trees. Robustness was assessed by how topological and branch length variation introduced by phylogenetic methods influences delimitation estimates in terms of species richness and identity.
A special branch of phylogenetic species delimitation (see Sites & Marshall 2003) is coalescent-based species delimitation methods (Pons et al. 2006;Fontaneto et al. 2007;Zhang et al. 2013), which combine coalescent theory with diversification models to infer the transition point between population and species-level processes on a gene tree. These approaches provide objective, clade-specific threshold(s) with which to delimit evolutionarily significant units (ESUs) of diversity (akin to species, as defined by the Evolutionary Species Concept -Simpson 1951). These methods provide an alternative to operational taxonomic unit (OTU) picking methods, which rely on arbitrary, clade-specific sequence similarity thresholds ). *Correspondence author. E-mail: cuong.tang@imperial.ac.uk The GMYC is one of the most popular coalescent-based species delimitation methods and is designed for single-locus data (Fujisawa & Barraclough 2013; although it can be used with concatenated-loci data, Pons et al. 2006;Fontaneto et al. 2007) and has been used to describe new species (Birky et al. 2011). The method separately models the fit of Yule (pure birth;Yule 1925) and coalescent processes (Hudson 1990) to an ultrametric tree to define the transition from species-level to population-level processes, used to delimit ESUs. The PTP (Zhang et al. 2013) is a recently developed method that models speciation and coalescent events relative to numbers of substitutions rather than time, and uses heuristic algorithms to identify the most likely classification of branches into population and species-level processes, used to delimit ESUs. This approach assumes either that substitutions are clocklike or, if substitution rates vary across the tree, that coalescent and speciation events occur at a constant rate per substitution event, rather than per unit of time. The key advantage of the PTP, however, is that it is devised for non-ultrametric trees.
Several studies have evaluated factors that could bias accuracy of the GMYC and PTP. For the GMYC, simulation studies have addressed the effects of various aspects of sampling (Papadopoulou et al. 2008;Bergsten et al. 2012;Reid & Carstens 2012;Talavera, Dinca & Vila 2013), population size and speciation rates (Esselstyn et al. 2012;Fujisawa & Barraclough 2013). For the PTP, simulations have been used to evaluate the effect of birth rates (i.e. evolutionary distances between species) and sampling unevenness (Zhang et al. 2013). Less attention has been paid to the influence of different phylogenetic methods for reconstructing the underlying gene tree. For coalescent-based species delimitation, phylogenetic and branch smoothing (defined as methods that correct rate heterogeneity to make ultrametric, clocklike trees) methodology are potentially large sources of bias if branch length and topological variation is introduced by different phylogenetic methods, for example, by different treatment of unresolved nodes and rate heterogeneity. Zero-length branches introduce infinite (logarithmic) branching rate artefacts that might bias species delimitation and underestimate (early placement of the threshold) or overestimate (recent placement) species diversity (GMYC), and heterogeneity in the rate of molecular evolution among lineages would violate the assumption that branching events can be modelled against substitutions directly (PTP). It is well known that different methods of rate smoothing introduce variability in branch lengths (Drummond & Suchard 2010) that can ultimately affect inferences made from the tree (Rutschmann 2006); artificially variable branch lengths might therefore result in variable diversity estimates with the GMYC. A previous assessment of the effect of phylogenetic methods on GMYC ESU estimates showed that certain method combinations perform poorly (Talavera, Dinca & Vila 2013), but is not clear whether this is generally true.
The GMYC, in combination with at least 11 different phylogenetic and 9 smoothing methods (Table S1), has been used in over 150 studies. BEAST (Drummond & Rambaut 2007) is the most popular software for obtaining gene trees (48Á9%), followed by MrBayes (25% -Ronquist et al. 2012) and RAxML (8Á3% -Stamatakis 2006). BEAST is also the most popular software for rate smoothing (53Á3%), followed by r8s (28Á5% -Sanderson 2003), PATHd8 (6Á7% - Britton et al. 2007) and chronopl (5Á5% -Paradis, Claude & Strimmer 2004). It is not clear from the literature why one particular phylogenetic method is favoured. Is BEAST chosen (i) due to historical preference, (ii) because a posterior sample of trees is desired, (iii) because it does not require a post hoc rate-smoothing step, or (iv) because it provides more accurate species hypotheses than other methods? We address the latter issue for both the GMYC and PTP by systematically evaluating their performance given different phylogenetic methods across several data sets.
We evaluate the GMYC and PTP methods using cytochrome c oxidase subunit 1 (COI) data sets, first, where the species boundaries and diversity are relatively well known: cowries (Meyer & Paulay 2005), Drosophila spp. and Romanian butterflies (Dinca et al. 2011). Secondly, we compare the methods using 13 COI data sets of Rotifera, a phylum where the taxonomy is much less resolved, the sampling not as comprehensive, and where the benefit of DNA taxonomy is expected to be the greatest. We provide guidelines for maximising the robustness of species hypotheses based on single-locus data with respect to phylogenetic method.

D A T A S E T S A N D G E N E T R E E S
We compiled over 12 000 COI sequences forming 16 data sets (Table  S2), corresponding mostly to genera (Rotifera + Drosophila) but also a family (Cypraeidae [cowries]; Meyer & Paulay 2005) and a comprehensive geographical sample comprising several families (99% of Romanian butterfly species; Dinca et al. 2011). Tree reconstruction followed standard protocols (Data S1; Fig. S1): (i) align sequences with outgroups (Table S3) using MAFFT v6.814b (Katoh, Asimenos & Toh 2009), (ii) remove non-unique haplotypes (for comparability the same matrix was used for all analyses, although this step is not necessary prior to generation of BEAST trees, see Talavera (Britton et al. 2007) and r8s (Sanderson 2003).
The presence of unresolved nodes and rate heterogeneity was measured directly from the trees. For each non-ultrametric gene tree, rate heterogeneity was measured as the standard deviation of the root to tip distances, where a greater standard deviation signifies greater rate heterogeneity. Analysis of BEAST trees was used to quantify whether the different species delimitation methods lead to different diversity estimates also where there are no unresolved nodes.

S P E C I E S D E L I M I T A T I O N
The GMYC method with a single threshold (ST-GMYC), multiple thresholds (MT-GMYC; Monaghan et al. 2009;Fujisawa & Barraclough 2013) and a multimodel approach (MM-GMYC; Powell 2012; Fujisawa & Barraclough 2013) was applied to each ultrametric gene tree using the splits 1Á0-11 (Ezard, Fujisawa & Barraclough 2009) R package. PTP analyses were performed using its webserver (http://species.h-its.org/). For each clade, up to 25 different GMYC and 30 PTP estimates were made. Primarily, the PTP analysis was used with nonultrametric gene trees (PTP-raw: trees without post hoc smoothing, as intended by Zhang et al. 2013), but smoothed trees were also used (PTP-all: all trees) for a direct comparison with the GMYC input trees.
The deviance of each ESU estimate from the expected diversity was gauged using the absolute difference between observed (ESU X ) and expected (ESU expected ) diversity, standardised among data sets by dividing by the average diversity of the focal data set (ESU meanA : including the focal ESU estimate). ESU expected was either obtained from the morphological species count (ESU morph ) or the average of the species counts from across all trees (ESU meanB : excluding the focal ESU estimate). For the three data sets where the species boundaries have been better evaluated, the morphological species count was determined using either the GenBank species name (Drosophila and Romanian butterflies) or expert advice (cowries; C. Meyer pers. comm.). In the absence of a reliable taxonomic species count for the 13 Rotifera clades, ESU meanB was used as a conservative estimate of species richness. The use of ESU meanB as a proxy for ESU expected was validated by the relationship between the residual variation derived from ESU morph and from ESU mean for the non-Rotifera data sets (File S1). Residual variation was determined for each gene tree and species delimitation method (see File S2 for examples of the calculations).

S P E C I E S I D E N T I T Y
Correspondence between ESUs and ESU morph , in terms of species identity, was evaluated for the three non-Rotifera data sets. For each species delimitation estimate, the number of morphospecies that were split, lumped, or an exact match to an ESU morph was counted. Exact matches are where an ESU contains all species from a single morphospecies and no other. Morphospecies are split if they are found in more than one ESU and lumped if multiple morphospecies are present within a single ESU. These counts were performed for ST-GMYC, MT-GMYC, PTP-all and PTP-raw, but not for MM-GMYC because the method returns non-integers.

F A C T O R S I N F L U E N C I N G R E S I D U A L V A R I A T I O N O F S P E C I E S R I C H N E S S A N D L U M P I N G A N D S P L I T T I N G O F M O R P H O S P E C I E S
Generalised linear mixed models (GLMM; Bates et al. 2014) with a Poisson error structure were used to ascertain how residual variation varies with species delimitation methods, combination of phylogenetic and smoothing methods, rate heterogeneity, and presence of unresolved nodes. Three GLMMs were used to ask: For (i) all trees and (ii) BEAST trees, how does residual variation vary with species delimitation and phylogenetic methods? (iii) For trees with post hoc smoothing, how does residual variation vary with species delimitation and phylogenetic methods, presence of unresolved nodes and rate heterogeneity? For each of the models, residual variation was used as the response variable and clade was blocked out as a random effect. A generalised linear model (GLM) with a quasibinomial error structure was used to assess if the proportion of morphospecies that are an exact match to an ESU (response variable) differed among species delimitation methods, clades and combination of phylogenetic and smoothing methods (explanatory variables).
For each of the models, significant differences among the levels were identified using post hoc Tukey HSD tests (multcomp 1.3-1 R package -Hothorn, Bretz & Westfall 2008). All analyses were performed in R.

Results
For each of the 16 clades, ten gene trees (4x BEAST, MrBayes, GARLI, PhyML, RAxML, NJ and UPGMA) and 25 ultrametric trees were generated. MrBayes analysis of the cowrie data set (1459 tips) failed to converge, leading to a total of 159 gene trees and 396 ultrametric trees. These were analysed using the ST-GMYC (396 analyses), MT-GMYC (374 analyses) and MM-GMYC (286 analyses). For the cowries, only the ST-GMYC was performed owing to computational demands; MT-GMYC ran for over a week on a 3GHz processor with 8GB RAM without reaching a local likelihood optimum. The reduced number of analyses for the MM-GMYC is due to the method not accommodating trees with unresolved nodes without manual input (the logarithm of zero-length branches produces an infinite branching rate), which would not have been achievable within the scope of the present study. In total, 475 PTP and 1056 GMYC analyses were performed (Table S3).

S P E C I E S R I C H N E S S
For each data set, the number of ESUs estimated (non-Rotifera, Fig. 1a Fig. 1d-g) varied among species delimitation methods. For all data sets, the PTP estimates, especially PTP-raw, best matched the expected diversity (Table 1; Fig. 1d-g). GMYC estimates varied depending on whether one or multiple thresholds or a multimodel approach was used (non-Rotifera, Fig. 1a Table 1). The ST-GMYC estimates were more variable (non-Rotifera, Fig. 1a-c; Rotifera, Fig. S2) and differed more from the expected diversity ( Fig. 1d-g; Table 1). Reanalysis of these data without BEAST and UPGMA trees removes some of the significant differences associated with delimitation methods (Table 1). For the non-Rotifera data set, there are no differences among the species delimitation methods. For the Rotifera data set, ST-GMYC is the only significantly different method (Table 1). When only BEAST trees were analysed, there were no significant differences in residual variation among delimitation methods (Fig. 2d-e; Table S4).
Different combinations of phylogenetic and smoothing methods resulted in varied ESU estimates (non-Rotifera, Fig.  S3; Rotifera, Fig. S2) and residual variation ( Fig. 3; Fig. S4; Table S5). The tendency to under-or overestimate diversity relative to the mean varied randomly among trees smoothed with different methods (Table S6). Gene trees smoothed with the chronopl and chronos functions typically led to highly variable ESU estimates (non-Rotifera, Fig. S3; Rotifera, Fig. S2) that differed from the expected diversity ( Fig. 3; Fig. S4; Table S6).

U N R E S O L V E D N O D E S A N D R A T E H E T E R O G E N E I T Y
The proportion of unresolved nodes differed among gene trees (Fig. 2c), from none for BEAST trees to 24% for NJ and 43Á8% for MrBayes trees. Increased residual variation is related to the presence of unresolved nodes for Rotifera (GLMM: t = 5Á92, df = 1046, P < 0Á0001; Fig. 2b) but not non-Rotifera (GLMM: t = 0, df = 81, P = 1; Fig. 2a) clades and interacts with rate heterogeneity for both Rotifera (GLMM: t = À3Á27, df = 1046, P = 0Á0011) and non-Rotifera (GLMM: t = À3Á28, df = 81, P = 0Á0015) data sets.

S P E C I E S I D E N T I T Y
The proportion of morphospecies that were inferred as an ESU did not differ significantly among the different species delimitation methods ( Fig. 4; Table 1). Most of the combinations of phylogenetic and smoothing methods produced similar proportions of exact matches (Table S6; Fig. 4), but those smoothed with chronopl or chronos produced significantly lower proportions of exact matches, resulting from either higher levels of lumping or splitting. Differences were the largest among the data sets, with a significantly higher proportion of exact matches for the Romanian butterflies than for the cowries (GLM binomial : Z = 3Á39, P < 0Á001; Fig. 4), but no differences when compared to the Drosophila. The proportion of exact matches was on average 63 AE 2% and was highest for the Romanian butterflies (70Á4 AE 3Á5%), followed by Drosophila (59Á1 AE 2Á6), and cowries (58 AE 4Á4%; Fig. 4; Fig.  S5). The type of mismatches differed in proportion between the three data sets (Fig. S5): cowries were typically split, Drosophila were lumped (ST-GMYC, PTP) and split (MT-GMYC), and the Romanian butterflies were lumped.

Discussion
Good taxonomy is central to any discipline using species as a fundamental unit. Coalescent-based, phylogenetic species delimitation clusters sequences into evolutionarily significant units. This approach relies heavily on the underlying tree and is affected by the choice of phylogenetic methods (Talavera, Dinca & Vila 2013). Our results indicate that the PTP method produces ESU estimates that are more robust to phylogenetic reconstruction methods than the GMYC method, except when BEAST trees are used.
Specifically, residual variation in ESU estimates was lowest for PTP-raw. The three implementations of the GMYC method differed in how robust they were to phylogenetic methods (MM-GMYC>MT-GMYC>ST-GMYC). As expected, the MM-GMYC produced ESU estimates that were more robust to different phylogenetic methods, although the MM-GMYC estimate is typically an average. Species delimitation using both the PTP and GMYC methods was consistent (lower residual variation) for BEAST trees, possibly because they require no post hoc smoothing step and contain no unresolved nodes. In contrast, analysis of NJ trees resulted in particularly large deviations in ESU estimates, irrespective of smoothing method. This is not surprising given that NJ is a clustering method that does not rely on an evolutionary model (Saitou & Nei 1987), known to underperform if the distance measure is not a correct estimate of nucleotide substitutions (Tateno, Takezaki & Nei 1994). There is also a large increase in residual variation associated with chronopl and chronos branch smoothing, which are particularly prone to haphazard lumping and PTP-all = all trees and PTP-raw = trees without post hoc smoothing) are shown separately. The traditional species count (red, dashed line), median (thick, black lines), first and third quartiles (box), 1.5 times the interquartile range (whiskers) and outliers (circles) are shown. Letters above the boxes represent significantly different comparison, and n below the bars represents the number of species delimitation analysis that constitutes that bar. Table 1. Simultaneous pairwise Tukey HSD tests for general linear hypotheses. Differences in residual variation of ESU estimates between each delimitation method (Species richness) analysed separately for the non-Rotifera and Rotifera data sets, or the proportion of exact matches to the traditional species (Species identity) analysed for non-Rotifera data sets. ST-GMYC, MT-GMYC and MM-GMYC refer to GMYC using a single-, multiple-threshold and multimodel approach. PTP-all and PTP-raw refer to species delimitation analyses where either all of the trees were used, or only the trees that were not rate smoothed post hoc.
Analyses were for (1) all of the trees and (2)  splitting ESUs irrespective of the degree of between-branch smoothing (k) chosen (File S3). This finding concurs with that of Talavera, Dinca & Vila (2013) who found that GMYC analyses of NJ trees smoothed with PATHd8, chronopl or chronos produced aberrant ESU counts.
To quantify parameters that differ among trees and may affect ESU estimates, we analysed the effect of rate heterogeneity and unresolved nodes, which are either characteristics of poor tree reconstruction (methodological or sample issues) or real features of the data. We found a significant effect of both these parameters on the GMYC and PTP output: diversity estimates for trees with highly variable rates and/or unresolved nodes deviated more widely from the expected diversity than clocklike, resolved trees (e.g. BEAST trees). Branch smoothing of trees with highly variable substitution rates can lead to exaggerated stretching of branches (Drummond & Suchard 2010), which will detriment all coalescent-based species delimitation methods that use branch lengths as an input. Unresolved nodes in the tree impinge on correct diversity estimates because their resolution can lead to artefacts in the branch length data (e.g. infinite branching rates) that could result in misplaced coalescent thresholds used for delimitation. For the GMYC, splitting might occur if infinite branching rates are found closer to the tips, while for the PTP, it might result from increased average, observed intraspecific cohesiveness resulting from no increase in branch lengths with more tips. Contrarily, the diversity could be underestimated if the unresolved nodes are closer to the root for the reciprocal reasons. Whether unresolved nodes and rate heterogeneity in the data are correlates or causes of incorrect diversity estimates remains to be tested. Encouragingly, their effect is alleviated when BEAST trees are used as input.
As a measure of how species identity differed among the methods, we assessed the proportion of ESUs that were exact matches to traditional species (morphospecies). We found similar levels of species richness to the traditional taxonomy but varying levels of discordance in identity between the traditional and DNA taxonomy. The proportion of exact matches was on average 63%, and variation in this was associated primarily, with combination of phylogenetic and smoothing methods and taxonomic group, but less with species delimitation method. We found no significant differences in the proportion of exact matches between the species delimitation methods, although the PTP method was qualitatively higher. The largest differences were between the three clades, and potentially points to the varying levels of taxonomic work in these groups. These differences seem to be driven by aberrantly deviant ESU estimates (in terms of richness and identity) associated with the use of chronopl and chronos smoothing methods, which typically split the cowrie morphospecies and lumped the Drosophila and Romanian butterfly species. Traditional species of the Romanian butterflies appear to be supported by DNA taxonomy perhaps because the data set represents a geographical (rather than taxonomic) sample. Species are expected to appear more distinct in such a sample because the closest relatives of most sampled species will not be sampled (Bergsten et al. 2012). Although the proportion of morphospecies that were lumped, relative to split, indicates that lower intraspecific sampling in this clade is over-representing the Yule process in the tree and thus missing some of the ESUs. The higher intraspecific sampling for the cowries and Drosophila indicates that the splitting of these species could be associated with unresolved taxonomy (Packer et al. 2009) or overlapping intra-and interspecific variation (Meyer & Paulay 2005;Wiemers & Fiedler 2007). While efforts have been made to resolve the taxonomy of these groups (Meyer & Paulay 2005;O'Grady & Markow 2009), a more concerted effort is required to address the gap between DNA and traditional taxonomy across the entirety of these clades (C. Meyer pers. comm).
By assessing ESU counts across 16 data sets with over 1500 separate species delimitation analyses, we have shown that the PTP-raw model with any robust gene tree and the GMYC used on BEAST trees produce consistently robust and, on average, accurate species estimates. These findings can probably be extrapolated to other genetic markers: COI and 18S are typically used for animals (Tang et al. 2012, multiple markers (e.g. 16S) for bacteria Morlon et al. 2012), ITS for fungi (Powell et al. 2011) and multiple markers (e.g. matK and rbcL) for plants (Hollingsworth & CBOL Plant Working Group 2009). Although the variability of these markers will likely yield different degrees of coalescent clustering and species separation (Tang et al. 2012) that warrants a more thorough evaluation.
Coalescent-based species delimitation is likely to gain in popularity: either to facilitate the description of biodiversity in an integrative, iterative way as a tool to tackle the burgeoning taxonomic crisis (Puillandre et al. 2012b), or to cluster sequences from NGS studies (Creer et al. 2010;Chariton et al. 2014). The latter would benefit from evolutionary approaches that provide a deeper understanding of the nature and extent of diversity ). Applying coalescentbased species delimitation to NGS is currently limited by the amount of variability, the short (but ever increasing) read lengths, the amplification success of the markers used and the computational expense of the coalescent-based metrics. As with all DNA taxonomy studies, primers need to be designed to combat the low amplification success of certain primers (Zhan et al. 2014), robust bioinformatics pipelines need to be developed (S. Creer pers. comm.), and sampling regimes that are representative of intra-and interspecific variability and geographical range should be considered (Papadopoulou et al. 2008;Lohse 2009;Bergsten et al. 2012;Talavera, Dinca & Vila 2013).
The PTP method is appealing when speed is essential because ultrametric trees are not required (Zhang et al. 2013), meaning that some of the problems encountered and the additional computation required with branch smoothing may be circumvented. However, the PTP makes the assumption that branching events scale with substitutions rather than time, which might be violated when substitution rates are heterogeneous. The GMYC with a BEAST tree provided equally consistent estimates but obtaining BEAST trees is computationally expensive. However, when rate heterogeneity is high and can be adjusted across the tree estimation using models, perhaps by use of well-informed internal calibration priors, then diversity estimation might benefit from sophisticated dating and diversity estimation procedures. We feel that the GMYC is more true to the speciation process, in that speciation and coalescence happen over time and not necessarily in (a) (b) (c) Fig. 4. The relationship between the proportion of exact matches (morphospecies = ESU) and (a) species delimitation metric, (b) data set and (c) combination of phylogenetic and smoothing method. Each data set was analysed using eight different phylogenetic methods (grey shaded areas). Median (thick black lines), first and third quartiles (box), 1.5 times the interquartile range (whiskers) and outliers (circles) are shown. Letters above the boxes represent significantly different comparisons.
relation to how many substitutions occur in marker genes. While the transition between speciation and coalescent processes, used to delimit species, might be mirrored by differences in the number of species-and population-level substitutions (PTP), time is a more direct expression of the process; therefore, methods that separately model this transition (time; GMYC) and phylogenetic methods that formally correct for substitution rate variation among species (e.g. BEAST) are conceptually more appropriate. We recommend use of both PTP and GMYC methods with the appropriate phylogenetic tree or choosing between them on a case-by-case basis, bearing in mind the differences in speed and underlying theory inherent in the two methods. The PTP method with non-ultrametric trees is currently quicker to implement than the GMYC, especially the MM-GMYC, although the speed of the GMYC could be increased with parallelisation. Both the phylogenetic and species delimitation steps become computationally demanding for larger data sets (e.g. NGS studies). Such data sets, which are often taxonomically broad, are likely to violate use of a single substitution rate and so increased parameterisation and prior information is more likely to yield trees that better reflect the data and thus provide more realistic diversity estimates. We envisage that better phylogenetic handling of substitution rate heterogeneity within the samples, irrespective of delimitation method, and the use of ESU nodal support as a proxy for species identity confidence, would further improve the delimitation of primary species hypotheses from singlelocus marker surveys.

Supporting Information
Additional Supporting Information may be found in the online version of this article.      Table S1. Literature review of trees used as GMYC input and the types of phylogenetic reconstruction performed from 2006 to the 11th of April, 2014). Table S2. Sequences information including accession numbers, species description, publication, etc. Table S3. Dataset information (# Seq., #Hap., outgroups, residual variation, etc.).