Volume 109, Issue 2 p. 987-999

The contribution of the edaphic factor as a driver of recent plant diversification in a Mediterranean biodiversity hotspot

Antoni Buira (Corresponding Author), Real Jardín Botánico (RJB-CSIC), Madrid, Spain. Email: [email protected]
Mario Fernández-Mazuecos, Real Jardín Botánico (RJB-CSIC), Madrid, Spain
Carlos Aedo, Real Jardín Botánico (RJB-CSIC), Madrid, Spain
Rafael Molina-Venegas, GLOCEE (Global Change Ecology and Evolution Group), Department of Life Sciences, Universidad de Alcalá, Madrid, Spain
First published: 15 October 2020
Citations: 25


  1. The high diversification rates of plant lineages in the Mediterranean Basin hotspot have been linked to a complex interaction of climatic stressors, geographic isolation and soil type, but the question remains as to which of these factors has been the most significant environmental driver of recent speciation.
  2. Here, we draw on distributional data for the entire endemic flora of the Iberian Peninsula, together with DNA-based phylogenies and spatial phylogenetic methods, to explore the patterns of relative phylogenetic endemism at different spatial resolutions and phylogenetic scales (superclades) and assess how environmental factors contribute to explain these patterns.
  3. We found that recent diversification of angiosperms as a whole, and particularly of eudicots, has been boosted by environmental stressors including high values of soil pH and dry-seasonal climatic conditions, while diversification of monocots has not been associated with soil conditions but with high elevation and less seasonal climate.
  4. Synthesis. These results provide robust insights into the environmental factors driving recent plant diversification in the Mediterranean Basin, including a role of soil properties that had not been quantified before. The contrasting environmental drivers of diversification in eudicots and monocots highlight the importance of analysing spatial phylogenetic patterns at multiple phylogenetic scales to get a better understanding of the processes that shape biodiversity.

1 Introduction


With nearly 22,500 plant species and 11,700 endemics, the Mediterranean Basin is the third major plant biodiversity hotspot of the world (Mittermeier et al., 2004). More than half of all plant species occurring only in the Mediterranean region are narrow endemics (i.e. they are unique to a well-defined small area; Thompson, 2005), which are primarily represented by perennial herbs and dwarf shrubs adapted to stressful habitats and largely clustered in species-rich lineages of recent origin (Buira, Cabezas, et al., 2020; Lavergne et al., 2004; Verlaque et al., 1997). Such exceptional concentration of narrow endemics has not gone unnoticed by biogeographers and evolutionary biologists, who have long emphasized the value of Mediterranean biomes for understanding the mechanisms underlying the generation and maintenance of plant biodiversity (e.g. Cowling et al., 1996; Médail & Verlaque, 1997; Molina-Venegas et al., 2013; Quézel & Médail, 2003).

It has been argued that high diversification rates in the Mediterranean are linked to a complex interaction of climatic stressors, geographic isolation and soil type (Molina-Venegas et al., 2013; Rundel et al., 2016). In particular, it is well-known that intensification of summer drought since the Pliocene (Suc, 1984) was a crucial stimulus for recent diversification of some lineages (Verdú & Pausas, 2013). Likewise, the rugged topography of the territory provides high heterogeneity of environmental niches and climatic refugia in geographic isolation, which fostered local speciation (Comes & Kadereit, 2003; Smyčka et al., 2017) and enabled lineages to survive periods of climatic oscillation such as the glacial–interglacial cycles of the Pleistocene (Médail & Diadema, 2009; Ohlemüller et al., 2008).

The association between soil type and endemism (‘edaphism’) was recognized by early Mediterranean botanists (e.g. Braun-Blanquet, 1932; Rivas Goday, 1969; Willkomm, 1852), who pointed out that (a) most endemics show either a strong preference for, or avoidance of, calcareous substrates and (b) many narrow endemics are closely linked to stressful substrates such as gypsum or dolomites (see also Escudero et al., 2015; Mota et al., 2017). However, the evidence connecting plant diversification and soil properties has hitherto been mostly narrative (but see Lobo et al., 2001; Molina-Venegas et al., 2013). The question remains as to whether climate, topography or soil has been the most significant environmental driver of recent speciation in the Mediterranean (Rundel et al., 2016).

In recent years, a family of methods collectively referred to as ‘spatial phylogenetics’ has emerged as a promising tool to evaluate spatial patterns of biodiversity from an evolutionary standpoint (e.g. Scherson et al., 2017; Thornhill et al., 2016, 2017). Specifically, Mishler et al. (2014) developed the concept of relative phylogenetic endemism to enable identification of significant centres of recent diversification by analysing the flora of a given geographic area using distributional data and DNA-based phylogenies. While this novel approach has been used in combination with climatic and topographical data to evaluate potential environmental drivers of recent diversification in some Mediterranean-type floras (Molina-Venegas et al., 2017; Thornhill et al., 2017), the role of edaphic conditions remains unexplored.

The Iberian Peninsula, in the western Mediterranean, is one of the two most species-rich areas of the Basin (together with the Anatolian Peninsula in the east; Médail & Quézel, 1997). It accounts for almost a quarter of the area of the Mediterranean Basin biodiversity hotspot and is home to nearly half of all European plants (Aedo et al., 2017). Thus, the Iberian Peninsula represents an ideal eco-historical setting to delve into the drivers of plant diversification in Mediterranean ecosystems. Here, we draw on distributional data for the entire endemic flora of the Iberian Peninsula, together with DNA-based phylogenies and spatial phylogenetic methods to (a) explore the patterns of phylogenetic endemism as a means of detecting centres of recent diversification; (b) assess how edaphic, climatic and topographic factors contribute to explain the spatial distribution of these centres and (c) test whether the patterns are consistent across plant superclades (i.e. monocots, eudicots, superasterids and superrosids). We hypothesize that soil properties may be at least as relevant as climate and topographic features in determining the spatial distribution of centres of recent plant diversification in the Iberian Peninsula.


2 Materials and methods

2.1 Study area

The study area (Figure 1) comprises the Iberian Peninsula (continental Spain, continental Portugal, Andorra and a narrow strip of land in southern France) and the Balearic Islands, which are a continuation of the Baetic System (southeastern Iberian Peninsula). The total extent of the study area is 598,830 km². Endemic species richness (1,357 species in total; Buira, et al., 2020) is unevenly distributed across the territory, the Baetic System being the richest region, with nearly 40% of all Iberian endemics (Buira et al., 2017).

FIGURE 1. Map of the Iberian Peninsula and the Balearic Islands showing the main geographic features. Colour indicates the predominant lithology: soils developed on acidic, siliceous rocks in yellow–blue, and soils derived from calcareous, basic materials in orange–red. The black line indicates the limits of the study area.

Old siliceous materials of the Hercynian basement, such as plutonic (granites) and metamorphic rocks (gneiss, quartzite, slate and schist), emerge extensively in the western half of the Peninsula and in the core of major mountain ranges. Sedimentary materials such as limestones, marls and evaporites predominate in eastern mountain ranges and plateaus. These geological features determine a major lithological divide between western acidic and eastern basic substrates (Figure 1). Two main bioclimatic regions can be distinguished. The Eurosiberian region covers a narrow strip in the north and northwest, including the Pyrenees and the Cantabrian Mountains, while the rest of the Peninsula and the Balearic Islands are included within the Mediterranean region, with a marked summer drought (Rivas-Martínez, 1987).

2.2 Endemic species list and distributional data

We used the species list from Buira, et al. (2020), which is based primarily on Flora iberica (Castroviejo, 1986–2019) and includes all angiosperm plant species endemic to the study area (see Figure 1). Gymnosperms and pteridophytes were not included in the study because (a) endemics of these groups are comparatively rare in the study area (<0.5% of all endemic species) and (b) the much older ancestor of all vascular plants compared to that of the angiosperms would have obscured phylogenetic patterns among the latter (Letcher, 2010; Qian et al., 2017). After removing taxa of dubious taxonomic validity or with deficient distributional data (10%), a total of 1,252 angiosperm species were included in this study.

Distributional data were obtained from the Anthos (www.anthos.es) and Flora-On (www.flora-on.pt) databases, which compile about 60,000 unique records on UTM 10 km × 10 km grid cells for the target species. Additionally, data for the northern side of the Pyrenees were complemented using the Atlas of the Pyrenean Flora (www.florapyrenaea.com). Several steps of data quality checking were conducted to remove potential errors (see Buira et al., 2017 for details). Despite the large amount of distributional data available, ranges of many endemics might still be substantially incomplete at 10-km resolution, which may have an effect on the analyses (Thornhill et al., 2017). Thus, in addition to 10 km × 10 km grid cells, all analyses were carried out using 50 km × 50 km grid cells to avoid potential biases caused by incomplete sampling (see Supporting Information). After excluding cells with no distributional data, the total extent consisted of 4,440 and 255 cells at 10- and 50-km resolution respectively.

2.3 Phylogenetic tree

We used the species-level time-calibrated global phylogeny published by Smith and Brown (2018). This phylogeny includes all taxa with DNA sequences available in GenBank up to the date of publication, and the phylogenetic inference was constrained to the backbone topology provided by Magallón et al. (2015). Instances of synonymy between species names in the global phylogeny and those in our list were detected following the taxonomic criteria in The Plant List website (www.theplantlist.org). After pruning the global phylogeny to include only the species that were in our list, 55% of Iberian endemics were still missing from the phylogeny. To include these taxa in the tree and account for uncertainty in their phylogenetic placement, we used a randomization approach (Rangel et al., 2015). Most missing species were added at random to the crown group of their corresponding genera using the add.random function in the phytools R package (Revell, 2012). In the few cases in which the genus was missing from the global phylogeny (only 14 monospecific genera), we used published phylogenetic evidence to constrain the randomization scheme. For example, molecular studies support the placement of Prolongoa, Hymenostemma and Castrilanthemum (missing in the global phylogeny) in the monophyletic Asteraceae subtribe Leucanthemopsidinae (Oberprieler et al., 2007), and thus their species were added randomly to the crown group of this clade in the phylogeny. This randomization procedure was repeated iteratively to obtain n = 1,000 alternative topologies, which were used to compute averaged values for phylogenetic metrics (see below).

2.4 Taxonomic and phylogenetic endemism metrics

Firstly, we calculated two taxonomic diversity metrics, that is, endemic richness (ER) and weighted endemism (WE), for each grid cell. While ER is simply the number of Iberian endemic species in a given cell, WE is the sum of the inverse range sizes (i.e. 1/number of grid cells) of the endemic species occurring in a grid cell (Crisp et al., 2001; Linder, 2001).
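
The two taxonomic metrics can be illustrated with a minimal Python sketch. It is not the code used in the study; the function name, the species labels and the cell identifiers are hypothetical, and the input is assumed to be a mapping from each endemic species to the set of grid cells it occupies.

```python
from collections import defaultdict

def endemic_richness_and_weighted_endemism(occurrences):
    """occurrences: dict mapping species name -> set of occupied grid-cell ids."""
    er = defaultdict(int)    # endemic richness per cell
    we = defaultdict(float)  # weighted endemism per cell
    for species, cells in occurrences.items():
        if not cells:
            continue  # species without records contribute nothing
        weight = 1.0 / len(cells)  # inverse range size (1 / number of cells)
        for cell in cells:
            er[cell] += 1
            we[cell] += weight
    return dict(er), dict(we)

# Hypothetical toy data: three endemics with ranges of 1, 2 and 4 cells.
occurrences = {
    "sp_A": {"c1"},
    "sp_B": {"c1", "c2"},
    "sp_C": {"c1", "c2", "c3", "c4"},
}
er, we = endemic_richness_and_weighted_endemism(occurrences)
print(er["c1"], we["c1"])
```

Because each species distributes a total weight of 1 across its range, the WE values summed over all cells equal the number of species, which is a convenient sanity check.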

In analogy with WE, phylogenetic endemism (PE) is the sum of the range-weighted phylogenetic branches present in a given grid cell (Rosauer et al., 2009), which in turn is used to compute relative phylogenetic endemism (RPE; Mishler et al., 2014). RPE is the ratio between PE measured on the actual tree and PE measured on a comparison tree that retains the actual tree topology but makes all branches of equal length (Mishler et al., 2014). Thus, while high RPE values indicate over-representation of long range-restricted branches (suggesting the presence of palaeoendemic lineages, sensu Stebbins & Major, 1965), low RPE values indicate over-representation of short, range-restricted branches, which can be interpreted as evidence of recent local diversification (i.e. neoendemism, Mishler et al., 2014).
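
A minimal Python sketch of PE and RPE follows. It is an illustration under stated assumptions rather than the implementation used in the study: the tree is represented as a hypothetical list of (branch length, descendant tips) pairs, the range of a branch is taken as the union of the cells occupied by its descendant tips, and branch lengths are expressed as fractions of total tree length so that the actual and comparison trees are on the same scale.

```python
def phylogenetic_endemism(branches, occurrences):
    """branches: list of (length, set of descendant tip names).
    occurrences: dict mapping tip name -> set of occupied grid-cell ids."""
    total_length = sum(length for length, _ in branches)
    pe = {}
    for length, tips in branches:
        # Range of a branch = all cells occupied by any of its descendants.
        branch_range = set().union(*(occurrences[t] for t in tips))
        share = (length / total_length) / len(branch_range)
        for cell in branch_range:
            pe[cell] = pe.get(cell, 0.0) + share
    return pe

def relative_phylogenetic_endemism(branches, occurrences):
    # Comparison tree: same topology, all branches of equal length.
    equal = [(1.0, tips) for _, tips in branches]
    pe_obs = phylogenetic_endemism(branches, occurrences)
    pe_cmp = phylogenetic_endemism(equal, occurrences)
    return {cell: pe_obs[cell] / pe_cmp[cell] for cell in pe_obs}

# Hypothetical toy tree ((A:1,B:1):4,C:5): A and B are young sister species
# confined to cell c1; C sits on a long branch and is confined to cell c2.
branches = [(1.0, {"A"}), (1.0, {"B"}), (4.0, {"A", "B"}), (5.0, {"C"})]
occurrences = {"A": {"c1"}, "B": {"c1"}, "C": {"c2"}}
rpe = relative_phylogenetic_endemism(branches, occurrences)
print(rpe)
```

In this toy case the cell holding the two recently diverged sister species has the lower RPE, in line with the interpretation of low RPE as neoendemism.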

Previous studies have used RPE to detect centres of palaeo and neoendemism in a certain region of interest using either the species of a target group that are endemic to the region (e.g. Mishler et al., 2014; Molina-Venegas et al., 2017) or entire floras regardless of the endemic status of the species (Scherson et al., 2017; Thornhill et al., 2017). Both approaches have advantages and limitations. On the one hand, using entire floras rather than endemic species may obscure or even confound patterns of recent in situ diversification, especially when many species that are widely distributed outside the study area occur in just a few localities within the study area (e.g. the Pyrenees mountains represent the southernmost distribution limit for many widespread European temperate species). On the other hand, using only endemics may result in overly long terminal phylogenetic branches if the closest relatives of certain endemic species are excluded from the analyses due to their non-endemic status (i.e. the most recent common ancestor of the endemic species and its closest relative might seem older in the phylogeny than it really is). The latter problem is also present, although to a lesser extent, when analysing entire floras, because the closest relatives of some non-endemic species may also be absent from the study area. Therefore, caution must be exercised when interpreting centres of palaeoendemism. In this study, we aimed at identifying the areas of recent in situ diversification (neoendemism) in the Iberian Peninsula, and therefore we focused on the species that are restricted to the region. Note that, unlike previous studies that also explored RPE patterns, we did not aim at categorizing cells into discrete classes on the basis of observed p-values (see the CANAPE framework in Mishler et al., 2014). Instead, we simply ranked cells based on a continuum of RPE values, which enabled the calculation of correlations with environmental variables (see below).

In order to get a better insight into RPE patterns, all analyses were conducted at multiple phylogenetic extents (Graham et al., 2018), including all angiosperm species (RPE-ANG) and four nested monophyletic superclades: (a) monocots (RPE-MON; 9.5% of all angiosperm species), (b) eudicots (RPE-EUD; 90.5%), (c) superasterids (RPE-SAS; 61.5%) and (d) superrosids (RPE-SRO; 26.3%).

2.5 Environmental data

In the first step, we calculated for each cell several environmental variables belonging to three categories:

  • Climatic, including means of annual temperature and annual precipitation, temperature seasonality and precipitation seasonality; calculated from WorldClim 2.0 layers at 1-km spatial resolution (www.worldclim.org; Fick & Hijmans, 2017).
  • Topographic, including means and standard deviations of elevation, slope, roughness and aspect variables; calculated from a digital elevation model (DEM) using terrain analysis tools of QGIS 3.4.
  • Edaphic, including mean and standard deviation of soil pH; calculated from the map of soil pH of Europe at 5-km resolution (Reuter et al., 2008). In the Iberian Peninsula, low (acidic) pH values typically correspond to soils developed on siliceous rocks while high (basic) values correspond to soils developed on carbonate or evaporite sedimentary rocks (see Figure 1).

In the second step, we explored individual correlations between response variables (i.e. ER, WE and RPE) and environmental variables. Only variables showing a significant correlation with at least one response variable were selected. Multicollinearity was prevented by testing correlations for each possible pair of explanatory variables. If the correlation coefficient was >0.7, the variable that showed the highest correlations with the other explanatory variables was excluded from the analyses. The explanatory variables that remained after the selection process are shown in Table 1.
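
The multicollinearity screening can be sketched in Python as follows. This is one possible reading of the procedure, not the study's code. It assumes that the 0.7 threshold applies to the absolute Pearson coefficient, that pairs are processed from the most to the least correlated, and that "highest correlations with the other explanatory variables" means the highest mean absolute correlation. The variable names and values are hypothetical.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def drop_collinear(variables, threshold=0.7):
    """variables: dict mapping variable name -> list of per-cell values."""
    kept = dict(variables)
    while True:
        r = {pair: abs(pearson(kept[pair[0]], kept[pair[1]]))
             for pair in combinations(kept, 2)}
        offending = [pair for pair, value in r.items() if value > threshold]
        if not offending:
            return list(kept)
        a, b = max(offending, key=lambda pair: r[pair])

        def mean_abs_r(name):
            values = [v for pair, v in r.items() if name in pair]
            return sum(values) / len(values)

        # Assumption: drop the member of the pair that is, on average,
        # more strongly correlated with the remaining variables.
        loser = a if mean_abs_r(a) >= mean_abs_r(b) else b
        del kept[loser]

# Hypothetical values for six grid cells (arbitrary units).
variables = {
    "Elevat": [1, 2, 3, 4, 5, 6],
    "Rough": [1, 2, 3, 4, 5, 7],   # nearly collinear with Elevat
    "Precip": [3, 1, 2, 2, 1, 3],
    "Soil_pH": [5, 7, 7, 5, 5, 7],
}
selected = drop_collinear(variables)
print(selected)
```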

TABLE 1. Environmental variables and their abbreviations used in this study grouped by category. The maximum and minimum values correspond to those calculated in 10 km × 10 km grid cells. Variables marked with an asterisk (*) were only significant in the analyses at 50-km resolution
Category | Variable | Abbrev. | Unit | Min. value | Max. value
Climatic | Annual precipitation mean | Precip | mm | 223 | 1,608
Climatic | Precipitation seasonality mean | P_Seas | % | 12 | 77
Climatic | Temperature seasonality mean | T_Seas | % | 27 | 70
Topographic | Elevation mean | Elevat | m a.s.l. | 0 | 2,597
Topographic | Slope angle mean | Slope | ° | 0 | 18.5
Topographic | Aspect (compass direction) mean* | Aspect | ° | 0 | 301.4
Topographic | Aspect standard deviation* | Asp_SD | - | 0 | 167
Edaphic | Soil pH mean | Soil_pH | - | 3.5 | 7.8
Edaphic | Soil pH standard deviation* | Soil_SD | - | 0.5 | 7.5

2.6 Multiple linear regressions and variation partitioning

Multiple linear regressions were used to model the relationship between the environmental variables selected in the previous step and taxonomic and phylogenetic endemism. As taxonomic metrics (ER and WE) were derived from overdispersed count data, they were log-transformed to meet parametric test assumptions. Although model reformation is recommended over data transformation for the analysis of count data (St-Pierre et al., 2018), we chose the latter option for two reasons. First, the outcomes of the test statistic and residuals versus fit plots were very similar using negative binomial regressions and linear regressions with data transformation. Second, variation partitioning and relative importance of regressor methods (see below) are only implemented for linear models. In order to mitigate the problem of under-sampled cells in the analyses at 10-km resolution, we fixed a threshold of at least 190 recorded species per grid cell (including endemic and non-endemic species), the mean species richness in the dataset (ranging from 2 to 1,500). As a result, the total number of observations for the regressions of taxonomic metrics at 10 km was 1,440 (32% of all grid cells).

Relative phylogenetic endemism values (RPE-ANG, RPE-EUD, RPE-MON, RPE-SRO and RPE-SAS) followed log-normal distributions, and they were also log-transformed before applying multiple linear regressions. In this case, we included all the observations, and the total numbers of endemics per grid cell and superclade were used as weights in the regressions to give greater importance to grid cells with larger amounts of data, as poor cells may not represent the centres of diversification despite low RPE values. All regressions were checked to ensure that model assumptions were met.
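
A weighted regression of log-transformed RPE can be sketched with NumPy as shown below. The sketch is illustrative only: the function name, the predictor values and the weights are hypothetical, and the toy response is generated without noise from known coefficients so that the fit can be checked by eye. It assumes that weights enter as in ordinary weighted least squares, that is, by minimizing the weighted sum of squared residuals.

```python
import numpy as np

def weighted_log_regression(y, X, weights):
    """Weighted least squares of log(y) on the columns of X (plus intercept)."""
    y_log = np.log(np.asarray(y, dtype=float))
    X = np.asarray(X, dtype=float)
    w = np.asarray(weights, dtype=float)
    design = np.column_stack([np.ones(len(y_log)), X])
    # Scaling rows by sqrt(w) turns weighted into ordinary least squares.
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(design * sw[:, None], y_log * sw, rcond=None)
    fitted = design @ beta
    y_bar = np.average(y_log, weights=w)
    ss_res = np.sum(w * (y_log - fitted) ** 2)
    ss_tot = np.sum(w * (y_log - y_bar) ** 2)
    return beta, 1.0 - ss_res / ss_tot

# Hypothetical data for five cells: soil pH, precipitation (hundreds of mm)
# and number of endemics per cell used as regression weights.
soil_ph = np.array([4.0, 5.5, 6.5, 7.0, 7.8])
precip = np.array([12.0, 9.0, 6.0, 4.0, 3.0])
n_endemics = np.array([3, 10, 25, 40, 12])
rpe_values = np.exp(0.5 - 0.3 * soil_ph + 0.1 * precip)  # noise-free toy response

beta, r2 = weighted_log_regression(
    rpe_values, np.column_stack([soil_ph, precip]), n_endemics
)
print(beta, r2)
```

With real data the weights would down-weight species-poor cells, whose low RPE values are less reliable indicators of recent diversification.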

Variation partitioning (Borcard et al., 1992) was applied to assess the overall contribution of groups of variables (i.e. edaphic, climatic and topographic) to explaining taxonomic diversity and relative phylogenetic endemism patterns. Input data for variation partitioning (modEvA R package; Barbosa et al., 2016) are the coefficients of determination (adjusted R²) of the response variable on all the explanatory variables in the full model, on the explanatory variables in each particular group, and on the explanatory variables in each pair of groups. The outputs are the amount of variation attributable purely to each given group of variables (climatic, topographic and edaphic) and the amounts of shared variation attributable to two or three groups. Fractions of variation can sometimes take negative values because two groups of variables may explain the response variable better than the sum of the individual effects of those two groups of variables (Legendre & Legendre, 2012), and these negative values should be interpreted as zero (Legendre, 2008).
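
The arithmetic behind a three-group partition can be sketched in Python as follows. This is not the modEvA implementation; the function name, the group labels (E, C and T for edaphic, climatic and topographic) and the adjusted R² values are hypothetical, and the fractions are obtained by simple subtraction of the seven input coefficients.

```python
def partition_variation(r2):
    """r2: adjusted R² of models fitted on each group ('E', 'C', 'T'),
    each pair of groups ('EC', 'ET', 'CT') and all groups ('ECT')."""
    full = r2["ECT"]
    parts = {
        # Pure fractions: what is lost when one group is left out.
        "E": full - r2["CT"],
        "C": full - r2["ET"],
        "T": full - r2["EC"],
        # Fractions shared by exactly two groups.
        "E+C": r2["ET"] + r2["CT"] - r2["T"] - full,
        "E+T": r2["EC"] + r2["CT"] - r2["C"] - full,
        "C+T": r2["EC"] + r2["ET"] - r2["E"] - full,
    }
    # Fraction shared by all three groups is the remainder.
    parts["E+C+T"] = full - sum(parts.values())
    parts["unexplained"] = 1.0 - full
    return parts

# Hypothetical adjusted R² values for the seven models.
r2 = {"E": 0.34, "C": 0.25, "T": 0.14,
      "EC": 0.47, "ET": 0.42, "CT": 0.32, "ECT": 0.52}
parts = partition_variation(r2)
print(parts)
```

Any negative fraction returned by such a calculation would be reported as zero, following the interpretation given above.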

Additionally, the relative importance of each individual explanatory variable was estimated for each model following Johnson and Lebreton's (2004) criterion, that is, by measuring the proportionate contribution of each predictor to R², considering both its direct effect on the dependent variable and its effect when combined with the other variables in the linear regression. In particular, the lmg metric (relaimpo R package) was used, which is the R² contribution averaged over orderings among regressors. This approach provides a decomposition of the variance explained by the model into non-negative contributions (Grömping, 2006).
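
The averaging over orderings that defines the lmg metric can be sketched in Python as shown below. This is not the relaimpo implementation. The sketch assumes that the R² of every sub-model is already available (here as a hypothetical lookup table), and it enumerates all orderings explicitly, which is only practical for a small number of predictors.

```python
from itertools import permutations

def lmg(predictors, r2_of):
    """predictors: list of names; r2_of: function frozenset(names) -> R²."""
    totals = {p: 0.0 for p in predictors}
    orderings = list(permutations(predictors))
    for order in orderings:
        included = frozenset()
        for p in order:
            with_p = included | {p}
            # Sequential R² gain when p enters after the preceding predictors.
            totals[p] += r2_of(with_p) - r2_of(included)
            included = with_p
    return {p: total / len(orderings) for p, total in totals.items()}

# Hypothetical R² values for every subset of three predictors.
r2_table = {
    frozenset(): 0.0,
    frozenset({"Soil_pH"}): 0.34,
    frozenset({"Precip"}): 0.25,
    frozenset({"Slope"}): 0.14,
    frozenset({"Soil_pH", "Precip"}): 0.47,
    frozenset({"Soil_pH", "Slope"}): 0.42,
    frozenset({"Precip", "Slope"}): 0.32,
    frozenset({"Soil_pH", "Precip", "Slope"}): 0.52,
}
shares = lmg(["Soil_pH", "Precip", "Slope"], r2_table.__getitem__)
print(shares)
```

The shares sum to the R² of the full model, which is the decomposition property referred to above.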


3 Results

3.1 Taxonomic diversity metrics

The highest ER values occurred mostly in the main mountain ranges (Figure 2a). Moderately high values also occurred in some coastal areas of the southern half. Weighted endemism followed a quite different pattern, with high values occurring almost exclusively in the Baetic Mountains and the Balearic Islands (Figure 2b). The overall explanatory power of linear regressions was near 50% for both ER and WE at 10-km resolution (Figure 2c,d). Variation partitioning and relative importance of variables revealed that topographic variables (elevation and slope) are by far the most important in explaining ER (Figure 2c). They were also highly significant in WE models, but climatic (particularly precipitation seasonality) and edaphic factors gained importance in explaining this metric (Figure 2d). Thus, results suggest that rugged mountains with seasonal rainfall and calcium-rich substrates are more likely to harbour narrow endemics. Results using a spatial resolution of 50 km were similar (see Figure S1); however, values of R² were slightly higher (particularly for WE), climatic variables had a lower relative importance and additional topographic (aspect) and edaphic (standard deviation of soil pH) variables contributed to explaining both diversity metrics. In particular, the contribution of the latter variable suggests that heterogeneity of substrates increases endemic richness; nevertheless, its relative importance was rather low.

FIGURE 2. Taxonomic diversity metrics for the Iberian flora at 10-km resolution. (a) Spatial distribution of endemic richness (ER). (b) Spatial distribution of weighted endemism (WE). (c, d) Variation partitioning assessing the contribution of edaphic, climatic and topographic groups of variables and relative importance of individual variables (symbols + or − indicate the sign of correlation) in explaining (c) ER and (d) WE at 10-km spatial resolution. Individual explanatory variables with a contribution <0.03 are not represented.

3.2 Spatial RPE patterns

The impact of phylogenetic uncertainty on RPE was rather low. At 10-km resolution, the mean coefficient of variation across the 1,000 topologies for all angiosperms (RPE-ANG) was 16.5%, and 75% of grid cells had a coefficient of variation below 20.2% (see Table S1). The variation was similar but slightly lower for the remaining groups, and it was significantly lower for all groups at 50-km resolution (e.g. the mean coefficient of variation of RPE-ANG was 8.2%).

Spatial patterns of RPE for all angiosperms (Figure 3e) show that mountain ranges of northern Iberian Peninsula (Pyrenees, northern Iberian System, Cantabrian Range and northern Portuguese mountains) and central-western mountains (part of Central System) have, on average, higher values of RPE. On the contrary, low RPE values prevail in eastern and southeastern mountains (e.g. Iberian System and Baetic System), although some cells of the northern (Sierra de Cazorla) and the western tips of the Baetic Mountains display high RPE. RPE is also high in the two eastern Balearic Islands, while it varies considerably across the southwestern coast of Iberia.

FIGURE 3. Spatial distribution of relative phylogenetic endemism (RPE) of the Iberian flora at 10-km resolution. (a) Eudicots (RPE-EUD). (b) Monocots (RPE-MON). (c) Superrosids (RPE-SRO). (d) Superasterids (RPE-SAS). (e) All angiosperms (RPE-ANG). Diagrams on the left of figures a-d show the size and position of each superclade within the phylogenetic tree of the Iberian endemic flora. (f) Correlation matrix among RPE values for all groups; values in bold are statistically significant. Classes of RPE values are visualized by quantiles to make comparisons among groups easier; only cells containing at least three endemics within each clade are represented.

Spatial patterns of RPE were largely similar for all superclades except the monocots (Figure 3a,b,c,d; see also Figure S2). For the latter, low RPE values are mostly located in northern high mountains, while cells with high values are scattered mostly in the western half of the Iberian Peninsula and Mallorca island (Figure 3b). All pairs of superclades, except those including the monocots (RPE-MON), displayed positive and significant spatial correlation of RPE values between them (Figure 3f). Superrosids and superasterids displayed the lowest significant correlation (0.19) but, unlike the other clade pairs showing significant correlations, they are independent (non-nested) clades. RPE-MON was only significantly correlated (0.32) with RPE-ANG, indicating that monocots had a considerable effect on the estimated RPE for all angiosperms. It is important to note that endemic monocots are very scarce in areas of low RPE-EUD, which partly explains the total lack of correlation between RPE-MON and RPE of the other superclades (RPE-EUD, RPE-SAS and RPE-SRO).

Regressions of RPE on environmental variables at 50-km resolution had significantly greater explanatory power than those at 10 km (Figure 4; Figure S3; Table S2). At both resolutions, models of RPE-EUD had the highest explanatory power (R² = 0.61 at 50 km and R² = 0.34 at 10 km), followed by RPE-ANG, RPE-SAS, RPE-SRO and RPE-MON. The relative variation partitioning by groups of variables and the relative importance of individual variables were also very similar at both spatial resolutions.

FIGURE 4. Variation partitioning assessing the contribution of edaphic, climatic and topographic groups of variables and relative importance of individual variables in explaining relative phylogenetic endemism (RPE) at 10-km spatial resolution. (a) Eudicots (RPE-EUD). (b) Monocots (RPE-MON). (c) Superrosids (RPE-SRO). (d) Superasterids (RPE-SAS). (e) All angiosperms (RPE-ANG). Symbols + and − indicate the sign of correlations. Individual explanatory variables with a contribution <0.03 are not represented.

With regard to differences among superclades, all angiosperms (RPE-ANG) and eudicot groups (RPE-EUD, RPE-SAS and RPE-SRO) presented similar regression patterns. Considering individual groups of environmental variables, the edaphic predictor explained the greatest variation of RPE for all these groups (Figure 4a,c,d; Figure S3a,c,d) and had a negative effect in all cases. As for climatic variables, annual precipitation had a positive effect on RPE and it was the climatic variable showing the highest relative importance (except in RPE-SRO; Figure 4c), while precipitation seasonality had a negative effect. Topographic variables had low weight; slope was in general the most significant and had a positive effect. Thus, for endemic eudicots of the study area, high values of soil pH and dry and seasonal climatic conditions are linked to high neoendemism levels (low RPE), while wet mountain areas are potentially linked to palaeoendemism (high RPE). In contrast, RPE of monocots was only positively correlated with precipitation seasonality and negatively with elevation, but both correlations were very weak.

4 Discussion


Non-phylogenetic geographical patterns of endemic richness in the Iberian Peninsula (Figure 2a,b) agree with the premise that mountain areas are major centres of species richness and endemism in the Mediterranean Basin (Lobo et al., 2001; Médail & Quézel, 1997; Thompson, 2005) as a result of orographic isolation, geomorphological complexity and buffering of climatic fluctuations (Favarger, 1972; Jetz et al., 2004; Ohlemüller et al., 2008; Dobrowski, 2011; see Notes S1 for details). Indeed, topographic variables (elevation and slope) are those that best explain endemic species richness and range-weighted endemism (WE) for the Iberian flora, followed by climatic variables (particularly precipitation seasonality; Figure 2c,d). The exceptional proportion of rare endemic species in the Baetic System compared to other Iberian mountain ranges can be explained by those topographic and climatic factors, in combination with historical causes (see Notes S1 for details). The latter include ameliorated climatic conditions during the glacial–interglacial fluctuations of the Pleistocene as a result of the lower latitude, maritime influence and wide altitudinal range (from 0 to >3,000 m a.s.l.), which led to low extinction rates and increased diversification (Carrión et al., 2003; Harrison & Noss, 2017; Médail & Diadema, 2009; Molina-Venegas et al., 2013). However, and much in line with previous studies (see Lobo et al., 2001), soil pH remained a comparatively poor predictor of non-phylogenetic metrics of Iberian plant endemism (see Figure 2c,d), especially compared to phylogenetic metrics (RPE).

4.1 Calcium-rich substrates have fostered recent diversification of Iberian plants

Variation partitioning and relative importance of variables revealed that the edaphic factor is the most important predictor of relative phylogenetic endemism of angiosperms (Figure 4e), which links recent diversification in the Iberian Peninsula to soils developed on carbonate and evaporite rocks (high soil pH). It has been argued that soil type may be more important than climate in determining species composition at a regional scale (Liu et al., 2020). Indeed, biogeographic regionalization of the Iberian Peninsula based on the endemic flora reflects a primary division between the predominantly basic eastern region and the predominantly acidic western region (Buira et al., 2017; Moreno Saiz et al., 2013), and territories with calcareous (basic) soils or with a mixture of both basic and acidic soils are usually richer in plant species (Lobo et al., 2001). In addition, calcareous substrates have been shown to sustain, on average, a larger proportion of range-restricted species than siliceous ones (e.g. Casazza et al., 2008; Médail & Verlaque, 1997; Smyčka et al., 2017) and this is also true for the Iberian flora (Buira, et al., 2020). While the importance of lithology in shaping community structure in the Mediterranean Basin has been pointed out before, quantitative evidence showing its role as a key driver of recent diversification has been lacking until now.

Compared to soil, topography appears to be relatively less important as a predictor of RPE (Figure 4), perhaps because both rapid recent speciation and long-term persistence of endemics can occur in mountainous areas, cancelling each other's signature. In contrast, climatic variables do provide additional insight into RPE. In particular, low annual precipitation and high precipitation seasonality are linked to low RPE values in the Iberian Peninsula. This is in line with the pattern observed in the Baetic-Rifan complex (Molina-Venegas et al., 2017) as well as in other Mediterranean-type regions such as California (Thornhill et al., 2017) and the Cape Region (Verboom et al., 2009), where neoendemism prevails under drier climatic conditions. It is clear that novel climate regimes of summer drought and aridity established since the Pliocene in the Mediterranean Basin (Suc, 1984) have influenced the rates of speciation and extinction and shaped patterns of endemism (Hopper & Gioia, 2004; Rundel et al., 2016). In particular, these climatic conditions have been a decisive stimulus for the recent and rapid diversification of several lineages in the Mediterranean Basin (Vargas et al., 2018; Verdú & Pausas, 2013).

Therefore, centres of recent plant diversification (low RPE and high endemic richness) in the Iberian Peninsula (Figure 3) are defined by a combination of stressful environmental conditions (high soil pH and marked summer drought). Endemics occurring on calcium-rich substrates are faced with nutritional imbalances, and may be additionally influenced by physical constraints and particular biotic interactions (Mota et al., 2017). Physical and chemical limitations imposed by stressful soils have the strongest impact on plant development under drought conditions (Escudero, 1996; Kruckeberg, 2004). Thus, intensification of summer drought may have boosted diversification of plant lineages through repeated specialization in contrasting and stressful soils (Molina-Venegas et al., 2015). This has likely occurred to a larger extent in eastern Iberia, which consists mainly of substrates derived from Cenozoic sedimentary deposits (including limestones, marls, dolostones and gypsum), in some places alternating with outcrops of siliceous materials. In Mediterranean-climate regions of western Iberia (mostly consisting of acidic rocks and edaphically more uniform), centres of recent diversification are rare, and they are primarily located in the siliceous Gredos Mountains (the highest mountain range in central Iberia) and the calcareous outcrops of central and southern Portugal (Algarve region; Figure 3). Limestones and dolostones are also found in the Pyrenees and the Cantabrian Mountains (both within the Eurosiberian bioclimatic region) but high RPE values prevail there, suggesting that, in the absence of summer drought, calcareous substrates have not acted as a driving force of recent diversification.

Many of the most species-rich Mediterranean plant lineages are highly diversified in eastern Iberia (e.g. Limonium, Centaureinae, Antirrhineae, Teucrium, Thymus, Sideritis and Helianthemum, which represent nearly 40% of the endemics of the area). Closely related endemic species frequently co-occur at a regional scale, which suggests that speciation of neoendemics may have taken place in situ. The speciation mechanism known as ‘budding’, in which a new range-restricted species originates within or at the margin of the range of a surviving ancestral species (Crawford, 2010), seems to be common in the Mediterranean-type region of California (Anacker & Strauss, 2014) and is probably also common in the Mediterranean Basin (Otero et al., 2019; Papuga et al., 2018). Plants are particularly prone to strong divergent selection caused by fine-scale environmental heterogeneity and, as a result, soil properties and microclimatic conditions may play an important role in speciation and ecological segregation at the regional scale (Rundle & Nosil, 2005; Anacker & Strauss, 2014; Molina-Venegas et al., 2016).

4.2 Contrasting patterns of recent diversification in eudicots and monocots

The analyses based on the largest clades (all angiosperms and eudicots) indicate that RPE patterns have been shaped by the environmental conditions described above (Figure 4a,e). However, the correlations become blurred in the analyses of smaller clades, particularly at 10-km resolution. This is partly because the estimation of RPE in a given cell is more affected by the occurrence of outliers (i.e. range-restricted species derived from extremely long or short branches) when the phylogenetic extent is reduced. Moreover, mechanisms other than abiotic environmental factors, such as dispersal limitation and competition, may be more important determinants of RPE patterns as phylogenetic extent decreases (Cavender-Bares et al., 2009; Graham et al., 2018).

As in all angiosperms and eudicots, recent diversification (low RPE) in both independent eudicot clades (superasterids and superrosids) is related to high soil pH and dry-seasonal climatic conditions (Figure 4c,d). Most large Iberian plant lineages that have radiated under such environmental settings belong to the superasterid superclade, and there is increasing evidence that these lineages underwent bursts of diversification during the Plio-Pleistocene (e.g. Limonium, Lledó et al., 2005; Centaurea, Hilpold et al., 2014; Antirrhinum, Vargas et al., 2009; Linaria, Blanco-Pastor & Vargas, 2013, Fernández-Mazuecos & Vargas, 2015; Teucrium, Salmaki et al., 2016). Although the RPE pattern of superrosids is more diffuse and values are less spatially correlated, the main Iberian centres of recent diversification are still in the south-east (Figure 3c). Superrosids include some clades with well-documented recent radiations consisting of many neoendemics in eastern and particularly south-eastern Iberia (e.g. Helianthemum, Martín-Hernanz et al., 2019; Erodium, Fiz-Palacios et al., 2010), as well as lineages that are highly diversified in other Iberian regions (e.g. Genisteae in the west; Saxifraga in northern mountains). Although high-elevation environments are well-known centres of recent diversification (Hughes & Atchison, 2015; Smyčka et al., 2017), our analyses only recovered a weak effect of elevation on RPE for eudicots. Nonetheless, some eudicot genera (e.g. Saxifraga, Androsace, Ranunculus) include multiple endemics occurring on the mountain tops of the Pyrenean-Cantabrian range, Sierra Nevada and other mountain ranges, and probably diversified rapidly in these habitats (Boucher et al., 2016; Cires et al., 2012; Dixon et al., 2007; Vargas, 2001).

In contrast to eudicot clades, recent diversification of monocots is not associated with soil conditions, and is only weakly associated with low precipitation seasonality and high elevation (Figure 4b). Large numbers of endemic monocots occur in northern and central-western mountain ranges and in Sierra Nevada, but they are rare in eastern Iberia. Indeed, richness of endemic monocots is higher in areas of high eudicot RPE (the correlation between endemic monocot richness and eudicot RPE is 0.43), and it is low or nil in areas of recent eudicot diversification. These results indicate divergent patterns of speciation in monocots and eudicots. Likewise, the lower ratio of endemism in monocots compared to eudicots in the Iberian Peninsula (14% vs. 28%) and in other Mediterranean floras (e.g. Davis et al., 1988; Fennane & Ibn Tattou, 2008) also suggests that environmental conditions boosting recent diversification in certain eudicot lineages have not driven monocot diversification to the same degree. Iberian endemic monocots derived from short-branched clades comprise only species of Gramineae (mostly Festuca) and Carex, which have diversified largely in high-mountain environments (e.g. Jiménez-Mejías et al., 2017; Marques et al., 2016). This partly explains the low monocot RPE values in some cells of the Pyrenees, Central System and Sierra Nevada (Figure 3b).

4.3 Final considerations

We relied on the RPE metric to identify centres of recent in situ diversification (neoendemism, low RPE) of Iberian plant species and their environmental correlates. It could be argued that low RPE values may also be the result of phylogenetic clustering due to habitat filtering based on phylogenetically conserved traits adapted to particular environmental conditions (Gerhold et al., 2015; Münkemüller et al., 2020; Webb et al., 2002) or coexistence of close relatives with conserved niches but small performance differences (Mayfield & Levine, 2010). However, we assume that the identified environmental factors are related to recent diversification for several reasons. First, RPE takes into account the geographic range of endemic species, so a low RPE value within a cell rich in endemic species indicates a high concentration of short-branched narrow endemics that very likely diversified in situ or in nearby areas. Thus, the rapid and recent radiation of certain clades linked to particular environmental conditions (marked summer drought and stressful substrates) would increase phylogenetic clustering in concert with environmental filtering, but not exclusively (Molina-Venegas et al., 2015; Smyčka et al., 2017). Further, if environmental filtering were the only driver of phylogenetic clustering in our study, we would expect low richness of endemics in grid cells with low RPE values, which is the opposite of the pattern reported here. Second, our study was carried out at the macroecological scale, so the footprint that biotic interactions may have left on the phylogeny would be diluted by the effect of differential diversification of lineages among regions. Likewise, grid cells are relatively large (10 and 50 km), so neighbouring species occurring within a cell do not necessarily occur in the same community or compete with one another.
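To make the range-weighting argument concrete, the following minimal sketch computes phylogenetic endemism (PE) and RPE for single grid cells. It assumes the commonly used definition in which RPE is the ratio of PE on the observed tree to PE on a comparison tree with the same topology but equal branch lengths; the exact implementation and any randomization tests used in this study may differ. The tree, species names, cell names and function names are hypothetical.

```python
# Toy tree ((A:0.1,B:0.1):0.9,(C:0.8,D:0.8):0.2), stored as
# (branch length, set of tips descending from that branch).
branches = [
    (0.1, frozenset({"A"})),
    (0.1, frozenset({"B"})),
    (0.9, frozenset({"A", "B"})),
    (0.8, frozenset({"C"})),
    (0.8, frozenset({"D"})),
    (0.2, frozenset({"C", "D"})),
]

# Hypothetical ranges: the grid cells occupied by each endemic species.
ranges = {
    "A": {"c1"},
    "B": {"c1"},
    "C": {"c2"},
    "D": {"c3"},
}


def phylogenetic_endemism(tree, species_ranges, cell):
    """Sum of range-weighted branch lengths for branches present in `cell`.

    Branch lengths are expressed as a proportion of total tree length, and
    each branch is divided by the number of cells occupied by its clade.
    """
    total_length = sum(length for length, _ in tree)
    pe = 0.0
    for length, tips in tree:
        clade_cells = set().union(*(species_ranges[t] for t in tips))
        if cell in clade_cells:
            pe += (length / total_length) / len(clade_cells)
    return pe


def relative_phylogenetic_endemism(tree, species_ranges, cell):
    """PE on the observed tree divided by PE on an equal-branch-length tree."""
    equal_tree = [(1.0, tips) for _, tips in tree]
    pe_null = phylogenetic_endemism(equal_tree, species_ranges, cell)
    if pe_null == 0.0:
        return float("nan")  # no endemic lineage recorded in this cell
    return phylogenetic_endemism(tree, species_ranges, cell) / pe_null


rpe_c1 = relative_phylogenetic_endemism(branches, ranges, "c1")
rpe_c2 = relative_phylogenetic_endemism(branches, ranges, "c2")
print("c1 (two young sister endemics):", round(rpe_c1, 3))
print("c2 (one long-branched endemic):", round(rpe_c2, 3))
```

In this toy case the cell holding two short-branched sister endemics has an RPE below 1, whereas the cell holding a single long-branched endemic has an RPE above 1, which mirrors the neoendemism versus palaeoendemism reading of the metric.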

Centres of palaeoendemism (high RPE), however, should be interpreted with caution in our study. In most cases, high RPE values are the result of endemism being represented by distant relatives rather than by relicts from past climate changes. In fact, impoverished lineages of pre-Mediterranean origin compose only a small fraction of the modern Mediterranean flora (Rundel et al., 2016), and Iberian narrow endemics representing ancient lineages are very rare (Vargas et al., 2020). Nevertheless, high RPE values were generally obtained in areas where there is available evidence for the presence of palaeoendemic species, such as the eastern Balearic Islands (Mallorca and Menorca), the central-southern Pyrenees, the mountains of northern Portugal and northwestern Spain, and the Cazorla Mountains in the Baetic region (Figure 3; see Notes S1 for details). Further investigation, possibly using phylogenies of endemic and non-endemic species, will shed more light on the distribution of palaeoendemic Mediterranean lineages.

Patterns of endemic richness and RPE, as well as variation partitioning and relative importance of variables, were largely robust to different spatial resolutions. However, phylogenetic uncertainty had a stronger effect at the finer spatial resolution (10 km), and the explanatory power of regressions was significantly lower at this resolution. All metrics used in this work depend on range sizes and on the pool of species recorded in each cell. As a result, metrics are affected by underestimated species ranges and by incomplete sampling of grid cells (Thornhill et al., 2017). Problems associated with biased distributional data are alleviated by the use of larger grid cells, although this strategy results in coarser patterns that are more difficult to interpret.

Our results provide robust insights into the environmental factors driving recent plant diversification in the Mediterranean Basin, including a role of soil properties that had not been quantified before. Additionally, we show evidence of contrasting environmental drivers of diversification in eudicots and monocots, which highlights the importance of analysing spatial phylogenetic patterns at multiple phylogenetic scales to get a better understanding of the processes that shape biodiversity.


Acknowledgements

We thank Miquel Porto and Xavier Font for providing us with data from Flora-on and BDBC, respectively. This work was partially funded by the Spanish Government through the Flora iberica project (CGL2017-85204-C3-1-P). R.M.-V. was supported by a TALENTO fellowship of the regional government of Madrid, Spain (2018-T2/AMB-10332). M.F.-M. was supported by a Special Intramural Project of the Spanish National Research Council (CSIC, 201930E078).


Authors' contributions

A.B. designed the research, prepared the database, conducted regressions and variation partitioning and wrote the manuscript; R.M.-V. built the phylogenies, conducted the phylogenetic analyses and contributed to the writing; M.F.-M. contributed to interpretation of results and writing; C.A. contributed to drawing up the endemic species list.


Peer review

The peer review history for this article is available at https://publons.com/publon/10.1111/1365-2745.13527.


Data availability statement

Data are available from Dryad Digital Repository https://doi.org/10.5061/dryad.c59zw3r5r (Buira, Fernandez-Mazuecos, et al., 2020).