Estimating national population sizes: Methodological challenges and applications illustrated in the common nightingale, a declining songbird in the UK
Abstract
- Estimation of national population size can be important for setting conservation priorities, but its methodology has received little critical attention. Sites for highly aggregated species are often prioritised if they contain 1% of national or biogeographic populations, but the utility of this approach for other species is unclear.
- To make recommendations for study design, we present methods used to estimate the UK population size of the common nightingale Luscinia megarhynchos. We assess the sensitivity of the population estimate to the analytical method used and identify sites of national importance for this territorial songbird.
- Survey effort was directed by prior knowledge of the species distribution and the survey design maximised detectability by focussing on the period of greatest song output. We used three different statistical methods to account for detectability, estimating that 55%–65% of the national population was detected during surveys.
- Birds in areas not known to contain the species accounted for 13%–23% of the population estimate. Methods to account for these individuals contributed the greatest uncertainty to the results, due to the difficulty of surveying a very large sample of random sites and consequent need to stratify the sample.
- The 12 derived estimates ranged between 5,094 and 5,938 territorial males, with the confidence limits ranging from 4,764 to 6,534. Site delimitation, using clustering based on nearest-neighbour distances, identified one site clearly of national importance and several others potentially nationally important, depending on the population threshold and clustering distance used.
- Synthesis and applications. National population estimation is difficult and requires that both species-specific variability in detectability and individuals present outside surveyed areas are accurately accounted for through survey design and statistical analysis. Accounting for these sources of error will not always be possible, which will hamper efforts to assess true population size and consequently to determine whether sites, however defined, exceed critical thresholds of importance. Resources may be better invested in other activities, for example, in generating population trends based on relative indices. The latter are generally easier to produce, potentially more robust and arguably more suitable for many conservation applications.
1 INTRODUCTION
Conservation decision making often requires estimates of population metrics at the scale of geopolitical units, such as entire nations or even continents. At these scales, indices of relative population size for well-monitored taxa such as birds are essential for determining population trends, for example, through standardised counts of defined areas in successive years (PECBMS, 2017a; Robinson et al., 2016). Trends derived in this way can be used to produce indicators and alerts for conservation priorities (BirdLife International, 2017; Eaton et al., 2015; Gregory et al., 2005; PECBMS, 2017b). National-scale estimates of absolute population size are used for several conservation purposes, most frequently in setting conservation priorities (briefly reviewed by Newson, Evans, Noble, Greenwood, & Gaston, 2008), particularly assessment of the importance of individual sites or the cumulative proportion of a species population protected by a network of sites.
A commonly used critical standard for designation of sites for special protection, for birds at least, is that they contain 1% of the population of a national or wider geographical area (the “1% rule”; Drewitt, Whitehead, & Cohen, 2015; Fuller & Langslow, 1986; Heath & Evans, 2000). This designation criterion requires good estimates of populations at national and site levels. The efficacy of such an approach depends on the identification of a suite of sites that can, in practice, be designated. While this approach could identify the full complement of sites necessary to protect populations of aggregated species, such as colonially nesting seabirds and flocks of wintering waterbirds, additional approaches may be required to identify a similarly robust set of sites for highly to moderately dispersed species, including many common songbirds and non-colonial birds of prey.
In reality, species vary continuously from widely dispersed to highly aggregated, depending on social factors and habitat specialism. Rarity, manifested as small population size and restricted range, may also increase the likelihood that individual sites hold nationally important populations as such species become concentrated in the most suitable habitats (e.g. Hewson & Fuller, 2007). Consequently, depending on how sites are defined, the 1% method may also be used to identify sites for species showing lesser degrees of aggregation, such as some territorial songbirds, helping to diversify the sites protected within a network.
Assessment of the importance of individual sites according to critical thresholds requires accurate estimation of the population sizes both within the sites and across the wider area. Survey methods must take account of both spatial and temporal variation in detectability, which may be severe depending on the characteristics and behaviour of the species (Alldredge, Simons, & Pollock, 2007; Best, 1981; Bibby, Burgess, Hill, & Mustoe, 2000; Robbins, 1981), since low or biased detectability might not be satisfactorily compensated for later, while analysis must fully account for undetected individuals.
Some methods which have the theoretical capability to assess local absolute population sizes do not do so in practice, because their assumptions are not met. In distance sampling, for example, detection on the transect line is often not 100% (Bächler & Liechti, 2007), in part due to species-specific temporal patterns of detectability (e.g. due to diurnal and seasonal variation in song output), resulting in estimates of detectability for only a subset of the population. Thus, national population estimates from scaling up local population estimates derived from distance sampling in standard monitoring schemes (Newson et al., 2008) may be underestimates. Consequently, species with variable detectability may especially require bespoke survey methodology.
National population estimates require assumptions to be made since they need to estimate the number of individuals across large areas, with limited survey resources and imperfect knowledge of species distribution. Comprehensive survey of the area of interest will usually not be possible, necessitating methods of sampling that allow extrapolation to account for birds in unsurveyed sites. Some studies have used species distribution models derived from data from multi-species monitoring schemes to generate national population estimates for more common and widespread species (e.g. Kéry & Royle, 2016). Species distribution models may not, however, adequately predict a species’ abundance when it has specific habitat requirements that cannot be measured on large spatial scales, as in many forest species (Hewson, Austin, Gough, & Fuller, 2011), or where social factors are important, where it is not in equilibrium with its environment, or where habitat relationships are not spatially and temporally constant, as in many declining species (Fuller, 2012). Several of these issues apply here and, consequently, we use surveys of a random sample of sites stratified according to suitability for the species to scale up for birds outside the known distribution. Furthermore, for species with variable detectability, the generic field methods of standard monitoring surveys may result in inadequate effective survey visits and low detection rates, so species-specific methodology is required. For scarcer species or those with intermediate levels of dispersion, the coverage of sites containing the target species may also be too low, yet these are typically those for which population estimates are required. Specific targeting of survey sites known, or likely, to contain the species is necessary. 
We illustrate this approach for a territorial songbird using prior knowledge of its uneven distribution coupled with the results from modelling an extensive independent atlas dataset (Balmer et al., 2013).
We explore requirements for studies estimating national population sizes and make recommendations for their design using a data-rich example of a territorial songbird in the United Kingdom, the common nightingale (Luscinia megarhynchos, Brehm, 1831, hereafter “nightingale”), which has declined by 61% in 25 years (Robinson et al., 2016), due to loss and degradation of habitat as well as possibly wider-scale factors (Holt, Hewson, & Fuller, 2012). It has consequently been placed on the Red List of Birds of Conservation Concern in the United Kingdom (Eaton et al., 2015). The dataset was based on a high volume of targeted volunteer and professional fieldworker effort, optimised using knowledge of patterns of song output, the key determinant of the species' highly variable detectability, which varies according to time of day, season and whether an individual is paired (Amrhein, Korner, & Naguib, 2002; Amrhein, Kunc, & Naguib, 2004; Amrhein, Kunc, Schmidt, & Naguib, 2007). We assess the sensitivity of the estimate to the assumptions made, using a series of analysis options, which also presents an opportunity to consider the consequences of potential data limitations in other real-world scenarios and how best to deal with them.
Using the same dataset, we then demonstrate how population estimates can identify sites of national importance for dispersed species, using an objective method of cluster identification to define sites. The results presented were instrumental in the notification of Chattenden Woods and Lodge Hill Site of Special Scientific Interest, an area of scrub and grassland in south-east England, enlarging the existing Chattenden Woods SSSI and adding a nationally important breeding nightingale population as a named feature (Natural England, 2013a, 2013b). When data are used in this way to support the establishment of protected areas, understanding the sensitivity of the estimate to the methods used and assumptions made is of paramount importance (Brouwer, Baker, & Trolliet, 2003).
2 MATERIALS AND METHODS
2.1 Sampling design and stratification
A nightingale field survey was carried out in the United Kingdom, primarily in 2012, with gaps in coverage addressed in 2013. The survey unit was the tetrad (2 × 2-km national grid square). A total of 2,733 survey tetrads were selected, comprising 2,433 tetrads known to be occupied by nightingales in the breeding season and a sample of 300 tetrads not known to hold breeding nightingales but within wider landscapes known to be occupied by the species. Because complete coverage of all survey tetrads was not feasible, 1,519 tetrads, including all “additional tetrads” (see below) and a sample of 1,219 known sites, were defined as “priority tetrads” for observer assignation to minimise biases in the sites that were covered in the survey (Appendix S1). Tetrads were assigned to a total of 1,281 different surveyors, either BTO volunteers or professional fieldworkers (Appendix S1).
2.1.1 Selection of tetrads occupied by nightingales
All tetrads known to be recently occupied by nightingales (“known sites”) were identified, including tetrads occupied in the previous national survey of breeding nightingales in Britain conducted in 1999 (Wilson, Henderson, & Fuller, 2002) and other tetrads occupied since 2007, identified from County Reports, the 2007–2011 bird atlas (“the Atlas,” Balmer et al., 2013) and BirdTrack (BTO/RSPB/BirdWatch Ireland/SOC/WOS, 2011). Atlas records from April to July and BirdTrack records from April to June were used, with the exception of suspected migrants (away from suitable breeding habitat, present on just a single day or not singing during April/May).
2.1.2 Additional tetrad selection
Predicted abundance of nightingales, previously achieved by modelling within the Atlas (Balmer et al., 2013), was used to produce a sample of tetrads where nightingales were not known to be present but which were situated in potentially suitable landscapes (“additional tetrads”). Sampling was stratified according to cumulative predicted nightingale abundance over 22 × 22-km squares centred on each tetrad by buffering each one with five concentric layers of tetrads (Appendix S1). This ensured representation of tetrads in landscapes with high predicted abundance of nightingales, which may be especially important for national population estimation.
2.2 Field methods
2.2.1 Survey protocol
Singing males formed the basis of the field methods, as song is usually strong, and they can be detected over considerable distances (in favourable conditions over several hundred metres). An area count method (Bibby et al., 2000) was adopted in which observers were instructed to search all areas of suitable habitat (see Appendix S1 for additional details).
Dates and times of surveys were chosen to cover the periods of peak song output of nightingales in the United Kingdom. Tetrads with any suitable habitat were searched on a minimum of two visits, ideally a week or so apart and between 21 April and 14 May. This period encompasses the end of the territory establishment period and up to the point at which most females will be incubating in a typical year, that is, the period before the diurnal territorial song output of paired males decreases (Amrhein et al., 2002, 2007). Although some males will still be arriving and establishing territories during this period, this can be accounted for by the effect of date in the detectability analyses. Surveys occurred from one hour before dawn to 08.30 hr—this is the time of day when song output of paired and unpaired males is most equal and includes the period when it is at its highest (Amrhein et al., 2004, 2007). Additional visits, both inside and outside these windows, were made in some cases but those made between 23.00 and 03.00 hr were not used in the population estimate. Previous surveys (e.g. Wilson et al., 2002) have included surveys conducted at night, but it is now known that detection at this time is heavily biased towards unpaired males (Roth, Sprau, Schmidt, Naguib, & Amrhein, 2009).
2.3 Analysis methods
2.3.1 Detectability modelling for surveyed known sites
The analytical “unit” was the territorial male. Raw survey counts of territorial (singing) males were corrected for birds missed during surveys using detectability modelling. This estimates the overall probability of territories being detected, using the number of times individual males were found or missed during repeated surveys, together with the effects of the date, time of day and duration of each visit across surveys of all tetrads, applied on a tetrad-by-tetrad basis. Two methods were used: an “Abundance” model using the count of singing males detected per visit to each tetrad; and a “Territory” model considering the detectability of individual singing males on a territory-by-territory basis, which is broadly similar to the approach used by Dawson and Efford (2009) to account for detectability of singing ovenbirds Seiurus aurocapilla. Each individual territory was sampled on 1–23 (median two) visits and the number of detections per territory ranged from 1 to 17. Five tetrads for which details of individual visits were not available were excluded. The 62 territories from these tetrads were subsequently added to the population estimates without any detectability correction; detectability would in any case have been very high, as 42 were recorded in a single tetrad to which six unspecified visits were made. For the “Territory model,” we analysed “capture histories” using the Huggins Robust Design (Huggins, 1989, 1991) and Program MARK (White & Burnham, 1999) called via the R package RMark (Laake, 2013; R Core Team, 2014, R version 3.1.2). A second “Territory model” analysis added a habitat covariate, dividing primary habitat within a tetrad into two broad categories corresponding to woodland or scrub. For the “Abundance model,” we analysed tetrad-level nightingale counts on multiple visits using an N-mixture model (Royle, 2004) in Program PRESENCE (www.mbr-pwrc.usgs.gov/software/presence.html).
See Appendix S1 for full details of the detectability modelling.
Detectability estimates for each tetrad (p) were calculated by combining the modelled estimates of detectability for each visit to the tetrad. Detectability-corrected counts for use in further analyses were then calculated as N = n/p, where the integer N is the detectability-corrected total for the tetrad, p the overall detectability estimate for a tetrad and n the total number of territories detected in that tetrad across all visits.
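The correction N = n/p can be illustrated with a minimal sketch. It assumes that per-visit detectabilities are combined as the probability of detection on at least one visit, p = 1 − ∏(1 − p_j), and that N is rounded to the nearest integer. Neither rule is stated in the text (the details are in Appendix S1), and the function names and input values are hypothetical.

```python
from math import prod

def tetrad_detectability(visit_probs):
    """Probability that a territory is detected on at least one visit.

    Assumes independent visits; this combination rule is an
    illustrative assumption, not taken from the paper.
    """
    return 1.0 - prod(1.0 - p for p in visit_probs)

def corrected_count(n_detected, p_tetrad):
    """Detectability-corrected total N = n / p, rounded to an integer."""
    if not 0.0 < p_tetrad <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    return round(n_detected / p_tetrad)

# Hypothetical tetrad: two visits with modelled detectabilities 0.5 and 0.6
p = tetrad_detectability([0.5, 0.6])   # 1 - 0.5 * 0.4 = 0.8
N = corrected_count(4, p)              # 4 / 0.8 = 5
```

With four territories detected and an overall detectability of 0.8, the corrected total is five, so one territory is inferred to have been missed.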
2.3.2 Accounting for nightingales in unsurveyed known sites
The number of nightingales present in the 344 known site tetrads that were not covered in the survey was estimated using two methods: (1) by prediction from the relationship between the final estimated counts (the higher of the detectability-corrected count and the number of territories detected in the main survey plus additional territories from casual records—see Appendix S1) and the predicted abundance from the model used to generate the map of abundance for the Atlas; and (2) by extrapolation from the difference between abundance in the 1999 nightingale survey and the final estimated counts in the 2012 survey in tetrads covered in both surveys. See Appendix S1 for further details of these methods. For both Method 1 and Method 2, 1,000 bootstrap replicates (with replacement) of the datasets were created and the regression carried out (Method 1) or ratio calculated (Method 2) using each one. The median and the 2.5 and 97.5 percentiles were used as the estimates and their 95% confidence intervals.
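The percentile bootstrap described for Method 2 might look like the sketch below. It assumes the ratio is the summed 2012 count divided by the summed 1999 count over tetrads covered in both surveys; the exact calculation is given in Appendix S1, so this form, the function names and the counts are illustrative assumptions only.

```python
import random

def percentile(sorted_vals, q):
    """Linear-interpolation percentile (q in [0, 100]) of a sorted list."""
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (pos - lo)

def bootstrap_ratio(counts_1999, counts_2012, n_boot=1000, seed=1):
    """Percentile bootstrap of the ratio sum(2012) / sum(1999).

    Tetrads are resampled with replacement, keeping each tetrad's
    pair of counts together.
    """
    rng = random.Random(seed)
    pairs = list(zip(counts_1999, counts_2012))
    ratios = []
    for _ in range(n_boot):
        sample = [rng.choice(pairs) for _ in pairs]
        old = sum(a for a, _ in sample)
        new = sum(b for _, b in sample)
        if old > 0:                      # skip degenerate resamples
            ratios.append(new / old)
    ratios.sort()
    return (percentile(ratios, 50),
            percentile(ratios, 2.5),
            percentile(ratios, 97.5))

# Hypothetical counts for ten tetrads covered in both surveys
counts_1999 = [3, 0, 5, 2, 1, 4, 0, 6, 2, 3]
counts_2012 = [2, 1, 3, 2, 0, 5, 0, 4, 1, 2]
median, lcl, ucl = bootstrap_ratio(counts_1999, counts_2012)
```

Multiplying the resampled ratio by the 1999 counts of the unsurveyed tetrads would then give one bootstrap replicate of the extrapolated total.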
2.3.3 Accounting for nightingales in other unsurveyed tetrads
The number of nightingales present outside known sites (i.e. in tetrads not known to have recently been occupied by nightingales) was estimated using the rate at which nightingales were estimated to be present using the final estimated counts in all surveyed additional tetrads. This rate was then multiplied by the number of non-survey tetrads within only those 10-km squares known to be occupied by nightingales (according to data from the Atlas and the current survey). This was to avoid applying the correction to a very large number of tetrads outside the range of the species. Given the recent range contraction of nightingale in the United Kingdom, this assumption of no nightingales in 10-km squares not known to be occupied is unlikely to be violated. We created 1,000 bootstrap replicates (with replacement) of the dataset of 267 surveyed additional tetrads and took the median and 2.5 and 97.5 percentiles of total nightingale abundance to derive the estimate and its confidence interval.
The tetrads to which the correction was applied were, on average, considerably less suitable for nightingales than the surveyed additional tetrads, according to the predicted nightingale abundance from the Atlas (0.032 and 0.109, respectively, giving a ratio of 0.409 after transforming to the abundance scale of this survey using the GLM outlined in the previous section). This was partly due to stratification of selection of additional tetrads and partly due to observers surveying the random tetrads most likely to hold nightingales (33 additional tetrads were not surveyed). Two methods were used to reduce this bias: (1) 47 additional tetrads surveyed in the additional fieldwork in 2013 were excluded from the dataset, as the objective of that year's fieldwork was to fill in gaps in the coverage of known nightingale sites and any additional tetrads covered then were probably chosen on the basis of their likely occupancy. (2) All additional tetrads were included but the total count for unsurveyed tetrads was scaled down by the ratio of relative suitability of the two sets of tetrads (0.409).
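The extrapolation and the second bias correction can be sketched as follows. The per-tetrad counts are hypothetical, the function name is illustrative, and the nearest-rank percentiles are a simplifying assumption. Only the figures 8,082 and 0.409 come from the text, and the output is not expected to reproduce the published estimates.

```python
import random

def extrapolate(counts, n_unsurveyed, scale=1.0, n_boot=1000, seed=1):
    """Bootstrap the mean count per surveyed tetrad and scale it up.

    Returns (median, lower, upper) of the extrapolated total, using
    nearest-rank 2.5 and 97.5 percentiles as the confidence limits.
    """
    rng = random.Random(seed)
    totals = []
    for _ in range(n_boot):
        sample = rng.choices(counts, k=len(counts))   # with replacement
        rate = sum(sample) / len(sample)              # birds per tetrad
        totals.append(rate * n_unsurveyed * scale)
    totals.sort()

    def pick(q):
        return totals[min(int(q * n_boot), n_boot - 1)]

    return pick(0.5), pick(0.025), pick(0.975)

# Hypothetical counts for 100 surveyed additional tetrads (mostly empty)
counts = [0] * 90 + [1] * 6 + [2] * 3 + [4]
unscaled = extrapolate(counts, n_unsurveyed=8082)
# Method 2: scale down by the relative suitability ratio from the text
scaled = extrapolate(counts, n_unsurveyed=8082, scale=0.409)
```

Because most tetrads hold no birds and a few hold several, the resampled rate varies widely, which is consistent with the wide confidence intervals reported for this group of tetrads.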
2.4 Population estimate methods
A series of population estimates was calculated by combining the estimates for groups of tetrads derived as described above. All 12 combinations of the following groups of estimates were used: surveyed known sites (three levels)—Huggins, Huggins (habitat), N-mixture; unsurveyed known sites (two levels)—interpolations from Atlas, ratio of 1999–2012 counts; and other unsurveyed tetrads (two levels)—extrapolation to tetrads in occupied 10-km squares from additional tetrads correcting for their predicted suitability for nightingales by two methods.
The confidence intervals for each estimate were calculated by combining those for each group of tetrads using the delta method (Powell, 2007; Seber, 2002) and those for the best estimate by combining the confidence intervals for the 12 estimates in the same way.
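As a worked illustration of combining intervals, the sketch below assumes that the groups are independent and that each interval is roughly symmetric, so that half-widths add in quadrature. This is a simplified reading of the delta method rather than the authors' code. The inputs are the group estimates reported in the Results for one of the 12 combinations, and under these assumptions the output matches the corresponding row of Table 1.

```python
from math import sqrt

def combine(groups, constant=0):
    """Sum group estimates and combine their confidence intervals.

    Each group is (estimate, lower, upper). Assumes independent groups
    and roughly symmetric intervals, so half-widths add in quadrature.
    """
    total = constant + sum(est for est, _, _ in groups)
    half_width = sqrt(sum(((up - lo) / 2) ** 2 for _, lo, up in groups))
    return total, total - half_width, total + half_width

groups = [
    (3816, 3639, 4002),   # surveyed known sites, N-mixture model
    (540, 491, 595),      # unsurveyed known sites, Atlas interpolation
    (1286, 735, 1910),    # other unsurveyed tetrads, 2012 tetrads only
]
# 62 territories were added without detectability correction
total, lcl, ucl = combine(groups, constant=62)
print(round(total), round(lcl), round(ucl))   # 5704 5087 6321
```

The interval for the other unsurveyed tetrads dominates the combined half-width, which shows numerically why that group contributes most of the uncertainty.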
2.5 Site assessment methods
Territories detected in the survey were grouped into sites by assigning all territories to site clusters, based on maximum distances between individual territories. This was undertaken for a series of distances, below which individual territories were regarded as “connected” to the same cluster, in order to examine sensitivity of the results to the distance chosen. Distances between 200 and 600 m (50-m intervals) and 750 and 1,000 m were used, informed by the distribution of nearest-neighbour distances across this entire dataset. Sites were considered to be nationally important if they comprised in excess of 1% of the highest upper confidence limit for the 12 population estimates produced, that is, if they contained more than 65 territories. As only territories detected during the surveys could be mapped and therefore contribute to site assessment, we also checked the detectability-corrected total of territories for tetrads comprising each site defined. See Appendix S3 for full details of methods used.
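The clustering rule can be sketched as single-linkage grouping, in which any two territories closer than the chosen distance fall in the same cluster, directly or through a chain of neighbours. This is one plausible reading of the description (Appendix S3 holds the actual method). The coordinates and function names are hypothetical, and the benchmark of 6,534 is the highest upper confidence limit reported in the Results.

```python
from math import hypot

def cluster_territories(points, max_dist):
    """Group points into clusters linked by chains of neighbours
    less than max_dist apart (single-linkage clustering)."""
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path halving
            i = parent[i]
        return i

    for i, (xi, yi) in enumerate(points):
        for j in range(i + 1, len(points)):
            xj, yj = points[j]
            if hypot(xi - xj, yi - yj) < max_dist:
                parent[find(i)] = find(j)   # merge the two clusters

    clusters = {}
    for i in range(len(points)):
        clusters.setdefault(find(i), []).append(i)
    return list(clusters.values())

def nationally_important(clusters, upper_limit=6534, share=0.01):
    """Clusters holding more than `share` of the population benchmark."""
    threshold = upper_limit * share         # 65.34 territories
    return [c for c in clusters if len(c) > threshold]

# Hypothetical coordinates (metres): a chain of three territories and an outlier
points = [(0, 0), (300, 0), (600, 0), (5000, 5000)]
sizes = sorted(len(c) for c in cluster_territories(points, max_dist=350))
# sizes == [1, 3]; at max_dist=250 every territory would be its own cluster
```

The example shows why results depend on the distance chosen: the first and third territories are 600 m apart, yet they join one cluster at 350 m because the middle territory links them.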
3 RESULTS
3.1 Detectability modelling
A total of 2,356 tetrads were surveyed, comprising 2,089 known sites and 267 additional tetrads (Figure 1). A total of 3,266 territorial males were recorded during the field surveys. Ninety-four per cent of visits (n = 3,015) were conducted in 2012 and 6% in 2013. Once tetrads for which data were not broken down by visit were excluded, the dataset included field visits conducted on 44 dates within a year to 2,351 tetrads and during which 3,204 nightingale territories were detected, meaning 62 territories in five tetrads were not included in any detectability analysis. In all three analyses, the best model included date, start time and duration as covariates—their effects are shown in Figure 2.
3.1.1 Territory model: Huggins Robust Design model
The most general Huggins Robust Design model was identified as the most parsimonious by AICc and possessed most support (Akaike weight = 0.97; Appendices S2 and S4) prior to including a habitat covariate. Mean detectability per tetrad was 0.795 (range 0.269–1). This model estimated abundance of occupied nightingale territories, N, to be 3,995 (SE 41, 95% CI 3,918–4,081).
After considering additional a posteriori models with a habitat covariate, our best model was the most general Huggins Robust Design model with habitat as an intercept and as an interaction with the five original time-varying individual visit covariates (Akaike weight = 0.77; Appendices S5 and S6). Abundance of occupied nightingale territories was estimated by this model to be 1,281 (SE 21, 95% CI 1,243–1,326) in woodland (habitat 1) and 2,347 (SE 32, 95% CI 2,288–2,415) in scrub (habitat 2). The detectability-corrected total of territories excluded from this analysis was 356. Summing these habitat-specific abundances, and the number of nightingale territories excluded due to lack of habitat data, closely approximated the estimate of 3,995 obtained with the initial model: 1,281 + 2,347 + 356 = 3,984 (95% CI 3,908–4,060).
3.1.2 Abundance model: N-mixture model
Our most general a priori N-mixture model was identified as most parsimonious by AICc and possessed all support (Akaike weight = 1.00; Appendices S7 and S8). This model estimated the mean abundance of nightingales per tetrad (λ) to be 1.62 (SE = 0.041, 95% CI = 1.55–1.71), because $e^{\beta_0} = e^{0.48} = 1.62$. Eighty per cent of tetrads were estimated to be occupied by nightingales (ψ = 0.80, SE = 0.0081, 95% CI = 0.79–0.82), given that $1 - e^{-1.62} = 0.80$. Total nightingale abundance for all 2,351 tetrads was estimated to be 3,816 birds (SE 92.63, 95% CI 3,639–4,002). This abundance estimate is similar to the two estimates from the Huggins Robust Design, and the confidence intervals include the best estimates from those models.
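The back-transformations quoted above can be checked in a few lines. The sketch assumes a Poisson mixing distribution, which is consistent with the arithmetic in the text but is not stated explicitly there; the variable names are illustrative.

```python
from math import exp

beta0 = 0.48                 # intercept on the log scale, as reported
lam = exp(beta0)             # mean abundance per tetrad
psi = 1 - exp(-lam)          # Poisson probability of at least one bird
print(round(lam, 2), round(psi, 2))   # 1.62 0.8
```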
3.2 Accounting for unsurveyed known sites
The bootstrap estimates for number of territorial males in unsurveyed known sites were as follows: Method 1 (Atlas interpolation) median 540 (CI 491–595); Method 2 (extrapolation using change between 1999 and 2012–2013) median 595 (CI 534–657).
3.3 Accounting for nightingales in other unsurveyed tetrads
Method 1: The sum of final survey totals across the 220 tetrads covered in 2012 was 37 nightingales. The bootstraps estimated 1,286 (median, CI 735–1,910) territorial nightingales were present in the 8,082 unsurveyed tetrads not known to contain the species. Method 2: Across all 267 surveyed additional tetrads, the sum of final survey totals was 46 nightingales. The bootstraps estimated that 676 (median, CI 436–977) nightingales were present in the 8,082 unsurveyed tetrads, after applying the correction factor for the predicted suitability of tetrads from the Atlas.
3.4 Population estimates
The 12 population estimates ranged between 5,094 and 5,938, with the individual confidence limits ranging from 4,764 to 6,534 territorial males (Table 1). The mean of the 12 estimates was 5,542 territorial males (CI 5,404–5,680). The means of the confidence limits for the 12 population estimates were 5,090 (LCL) and 5,995 (UCL). Birds detected during surveys comprised 55%–64% of population estimates; detectability-corrected totals comprised 66%–76%, birds in tetrads not known to contain the species comprised 13%–23% and birds in all unsurveyed tetrads comprised 23%–33%.
Surveyed tetrads | Unsurveyed tetrads | Other tetrads | Population | LCL | UCL |
---|---|---|---|---|---|
Abundance | Atlas interpolation | 2012 only | 5,704 | 5,087 | 6,321 |
Abundance | Atlas interpolation | All tetrads corrected | 5,094 | 4,764 | 5,424 |
Abundance | 1999 vs. 2013 comparison | 2012 only | 5,759 | 5,141 | 6,377 |
Abundance | 1999 vs. 2013 comparison | All tetrads corrected | 5,149 | 4,817 | 5,481 |
Huggins | Atlas interpolation | 2012 only | 5,883 | 5,288 | 6,478 |
Huggins | Atlas interpolation | All tetrads corrected | 5,273 | 4,986 | 5,560 |
Huggins | 1999 vs. 2013 comparison | 2012 only | 5,938 | 5,342 | 6,534 |
Huggins | 1999 vs. 2013 comparison | All tetrads corrected | 5,328 | 5,039 | 5,617 |
Huggins with habitat | Atlas interpolation | 2012 only | 5,872 | 5,277 | 6,467 |
Huggins with habitat | Atlas interpolation | All tetrads corrected | 5,262 | 4,976 | 5,548 |
Huggins with habitat | 1999 vs. 2013 comparison | 2012 only | 5,927 | 5,331 | 6,523 |
Huggins with habitat | 1999 vs. 2013 comparison | All tetrads corrected | 5,317 | 5,029 | 5,605 |
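The summary figures for the best estimate can be recomputed from the table above. The sketch treats the 12 intervals as independent and symmetric and combines their half-widths in quadrature. That is an assumption about how the intervals were combined "in the same way" (the 12 estimates share components, so they are not truly independent), although the result matches the reported interval.

```python
from math import sqrt

# (estimate, LCL, UCL) for the 12 combinations in the table above
rows = [
    (5704, 5087, 6321), (5094, 4764, 5424), (5759, 5141, 6377),
    (5149, 4817, 5481), (5883, 5288, 6478), (5273, 4986, 5560),
    (5938, 5342, 6534), (5328, 5039, 5617), (5872, 5277, 6467),
    (5262, 4976, 5548), (5927, 5331, 6523), (5317, 5029, 5605),
]
n = len(rows)
mean_est = sum(est for est, _, _ in rows) / n
# Assumed rule: half-width of the mean = quadrature sum of half-widths / n
half_width = sqrt(sum(((up - lo) / 2) ** 2 for _, lo, up in rows)) / n
print(round(mean_est), round(mean_est - half_width), round(mean_est + half_width))
# 5542 5404 5680
```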
3.5 Site assessment
A site named “Lodge Hill” (with territories almost entirely contained within the Chattenden Woods and Lodge Hill SSSI boundary) was identified as the most important site for this species in the United Kingdom. It was identified as nationally important using clusters derived from nearest-neighbour distances of 350 m and above (Table 2), based on the conservative benchmark we used to assess the 1% threshold (>65 territories). Two other sites (Colchester Barracks and Theale) were nationally important based on nearest-neighbour distances of 500 m and above, while clustering based on distances of 750 and 1,000 m each resulted in the identification of one further site and in the aggregation of smaller sites identified separately at shorter distances (Table 2). Detectability correction at the tetrad level made little difference (0–2 territories) for all sites except Reading, where an increase of five was due largely to territories outside the site. See Appendix S9 for further details.
Distance (m) | Site rank | Site name | Territories (n) |
---|---|---|---|
350 | 1 | Lodge Hill^a | 67 |
400 | 1 | Lodge Hill^a | 84 |
450 | 1 | Lodge Hill^a | 87 |
500 | 1 | Lodge Hill^a | 88 |
500 | 2 | Colchester Barracks^b | 76 |
500 | 3 | Theale | 66 |
550 | 1 | Lodge Hill^a | 88 |
550 | 2 | Colchester Barracks^b | 76 |
550 | 3 | Theale | 66 |
600 | 1 | Lodge Hill^a | 88 |
600 | 2 | Colchester Barracks^b | 76 |
600 | 3 | Theale | 66 |
750 | 1 | Reading and Theale | 108 |
750 | 2 | Lodge Hill^a | 91 |
750 | 3 | Ebernoe^b | 87 |
750 | 4 | Colchester Barracks^b | 80 |
1,000 | 1 | Colchester Barracks and Fingringhoe Wick^b | 138 |
1,000 | 2 | Reading and Theale | 109 |
1,000 | 3 | Ebernoe^b | 101 |
1,000 | 4 | Lodge Hill^a | 91 |
1,000 | 5 | Wisborough Green^b | 79 |
- Distance = nearest-neighbour distance used for clustering; site rank is based on number of territories for each distance; territories is the number of territories included in each cluster.
- a Territory cluster on SSSI notified for breeding nightingales.
- b Territory cluster on or partly on SSSI.
4 DISCUSSION
National population estimation is important for conservation prioritisation, at both species and site levels (Drewitt et al., 2015; Heath & Evans, 2000; see also Newson et al., 2008). The example we present here of a localised territorial songbird from the United Kingdom is unusual because of the high-quality background information available for planning the study, as well as the extensive dataset collected using bespoke methodology. This presents an opportunity to discuss best practice in population estimation and to assess the robustness and impact of different analysis options. These are informative for other studies where data quality and/or availability are lower or other types of species are studied. In particular, it is notable that even in the favourable conditions of this survey birds detected in the survey comprised only 55%–65% of the total population estimates, demonstrating the importance of controlling for detectability within surveys and for birds in unsurveyed sites.
While common and widespread species may be adequately surveyed by extensive standard monitoring scheme datasets (e.g. Kéry & Royle, 2016), this is not the case with scarcer species such as the nightingale in the United Kingdom. We directed survey effort using extensive prior knowledge of the species’ localised and uneven distribution. Nonetheless, areas not known to contain the species contributed 13%–23% and all unsurveyed sites contributed 23%–33% to the population estimates; in other cases, it is likely that these proportions will be higher. The availability of an extensive, independent dataset of predicted nightingale abundance from the Atlas (Balmer et al., 2013) allowed us to stratify our selection of “additional tetrads” in a manner analogous to that presented by Guisan et al. (2006). To ensure that they were adequately sampled within the available survey resource, all tetrads with very high predicted nightingale abundance were placed in the sample. It was, therefore, necessary to explore alternative methods of scaling up, in order to account for this bias in suitability in the surveyed tetrads. This resulted in estimates for birds in unsurveyed tetrads not known to hold the species that differed by 90% of the smaller estimate, with the larger estimate probably being too high as the difference in nightingale suitability was not fully controlled for (Appendix S10). Clearly, any attempt to select survey sites according to suitability for the target species when extrapolating outside a species’ known distribution must account for the biases that result. The uncertainty of these estimates was also high, with the confidence intervals being 69% and 85% of the estimates for this group of tetrads, suggesting a larger sample would have been beneficial.
Accounting for birds outside the surveyed areas is perhaps the biggest challenge in national-scale population estimation and necessitates surveying as large a sample of random sites as possible—this is especially important when lack of prior knowledge or survey resources prevents a high proportion of the population being surveyed and where there is a lack of suitable covariates for predicting species occurrence. Survey datasets used for scaling up through modelling must, however, be extensive enough to appropriately constrain the process; extrapolation across too wide an area could make a very large difference to the population estimate. This was why, in our case, with excellent knowledge of the species distribution, we restricted extrapolation to tetrads falling within 10-km squares known to have been recently occupied by the species.
In-depth knowledge of temporal variation in detection, combined with the high conspicuousness of singing nightingales, allowed us to tailor our survey methods to maximise the proportion of birds available for detection during core survey periods, since poor or biased initial detection is unlikely to be satisfactorily compensated for statistically. Nonetheless, 14%–18% of the total estimated to be present in surveyed tetrads went undetected, probably an accurate proportion given the similar estimates (3,816–3,984 territories) from the three statistical methods used. Accurate estimation at the site level is critical for extrapolation to national scale and requires recognising and addressing the limitations of methods used to control for detectability, both inherent to the statistical methods (e.g. estimate of detectability on the transect line in distance sampling: Bächler & Liechti, 2007) and to the ecology of the organisms (e.g. being aware of temporal as well as spatial patterns of detectability and tailoring field methods accordingly).
Repeated visits to survey units at critical periods are important to allow low and/or temporally variable detectability to be controlled for but, where survey methods have been optimised to maximise detectability within visits, the number of visits should be minimised to allow more survey units to be sampled, as in this study where the median number of visits was 2. Survey units should be large relative to the size of territories of the organisms surveyed to minimise the overlaps of their boundaries, since this could both increase the effective survey-unit area (Kéry & Royle, 2016) and reduce apparent detectability (Appendix S10). In our study, any such effects were offset to at least some extent by other factors (Appendix S10). Ensuring that potential biases are balanced as far as possible is important, given that the magnitude of each may be impossible to determine (Kéry & Royle, 2016).
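The trade-off between visits per unit and number of units can be made concrete with a simple sketch. It assumes a constant per-visit detection probability and independence between visits, which will not hold exactly for a species whose song output varies through the season. The per-visit probability (0.5) and the visit budget (1,200) are hypothetical and are not estimates from this study.

```python
def cumulative_detection(p_visit, n_visits):
    """Probability that a territory is detected on at least one visit,
    assuming independent visits with constant detection probability."""
    if not 0.0 <= p_visit <= 1.0:
        raise ValueError("p_visit must lie between 0 and 1")
    if n_visits < 0:
        raise ValueError("n_visits must be non-negative")
    return 1.0 - (1.0 - p_visit) ** n_visits


def units_surveyable(total_visits, n_visits):
    """Number of survey units that a fixed visit budget can cover."""
    return total_visits // n_visits


P_VISIT = 0.5        # hypothetical per-visit detection probability
VISIT_BUDGET = 1200  # hypothetical total number of visits available

for k in (1, 2, 3, 4):
    detected = cumulative_detection(P_VISIT, k)
    units = units_surveyable(VISIT_BUDGET, k)
    print(f"{k} visit(s): P(detected) = {detected:.2f}, units covered = {units}")
```

Each extra visit adds less to cumulative detection than the one before, while the number of units covered falls in proportion, so the gain from further visits diminishes quickly once within-visit detectability is already high.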
In real-world applications, site designations are harder to make based on detectability-corrected counts than on actual counts. Actual counts may be necessary for site delimitation, so ensuring that survey effort is high at known or suspected hotspots is important. We achieved this through increasing the number of visits during the critical survey period, hence the small difference that detectability correction made in key aggregations. Where high survey effort is not possible or does not result in a very high proportion of organisms being detected, assessment of site importance will be conservative unless consideration of detectability-corrected site-level populations can be justified. This further emphasises the need for careful and robust survey design.
Delimiting sites by clustering of territories using a variable nearest-neighbour distance demonstrated the efficacy of combining site-level and national-scale population estimates for site prioritisation. One key site (Lodge Hill) was identified as clearly nationally important, even under conservative scenarios of site definition and assessment. Other sites were identified as important at higher clustering distances, including when aggregating smaller sites which could themselves have been nationally important using some assessment benchmarks, depending on the population estimates used and their confidence intervals.
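The general idea of distance-based site delimitation can be sketched as follows. This is an illustration only: it assumes single-linkage clustering (two territories belong to the same site if a chain of neighbours no further apart than the clustering distance connects them), which may differ in detail from the procedure used here. The coordinates, the clustering distances and the national estimate of 500 are hypothetical; only the 1% threshold reflects the convention discussed in this paper.

```python
from math import hypot


def cluster_territories(points, max_dist):
    """Single-linkage clusters of (x, y) points in metres.
    Returns lists of point indices, largest cluster first."""
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the cluster root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            dx = points[i][0] - points[j][0]
            dy = points[i][1] - points[j][1]
            if hypot(dx, dy) <= max_dist:
                parent[find(i)] = find(j)  # merge the two clusters

    clusters = {}
    for i in range(len(points)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values(), key=len, reverse=True)


def important_sites(clusters, national_estimate, threshold=0.01):
    """Clusters holding at least `threshold` of the national estimate."""
    cutoff = threshold * national_estimate
    return [c for c in clusters if len(c) >= cutoff]


# Hypothetical territory centres (metres): a group of six, a pair, a singleton.
territories = [(0, 0), (100, 0), (200, 0), (300, 0), (400, 0), (500, 0),
               (5000, 0), (5150, 0), (20000, 0)]

tight = cluster_territories(territories, max_dist=200)
loose = cluster_territories(territories, max_dist=5000)

print([len(c) for c in tight])  # cluster sizes at the smaller distance
print([len(c) for c in loose])  # cluster sizes at the larger distance
print(len(important_sites(tight, national_estimate=500)))
```

Raising the clustering distance merges the pair into the larger group, which mirrors how site size, and hence apparent importance, depends on the distance chosen as well as on the national estimate against which the threshold is evaluated.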
In conclusion, the substantial agreement between the 12 estimates produced, with the highest just 17% greater than the lowest, shows that it is possible to produce national population estimates that are not highly contingent upon the methods used. This is important as it may not be possible to know in advance which method is best, and it is necessary to understand how methods might impact upon the results. Our estimate is most likely to be a slight overestimate (Appendix S10). The previous population estimate of 6,700 in 1999 (Wilson et al., 2002) was almost certainly a significant underestimate, primarily as detectability was not adequately considered (Appendix S10).
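The 17% figure follows directly from the lowest and highest of the 12 estimates reported in the Abstract (5,094 and 5,938 territorial males), as this short check shows.

```python
lowest, highest = 5094, 5938  # range of the 12 estimates (territorial males)

# Spread of the estimates relative to the lowest one.
relative_spread = (highest - lowest) / lowest

print(f"{relative_spread:.1%}")  # prints 16.6%, i.e. about 17%
```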
This paper shows that accurate national population estimation can require high quantities of data that are unattainable in many circumstances. Some judgements will always be necessary when attempting to scale up population estimates to national scales and since these assumptions can rarely be tested in the field, it is essential to use alternative methods to assess the sensitivity of the estimates to various realistic assumptions. Where population estimates are absolutely necessary for conservation (e.g. in the application of widely used critical thresholds for site designation), recognition and acknowledgement of the limitations should be given, especially where data limitations prevent the most rigorous approaches (Brouwer et al., 2003). Each study is unique in terms of species biology, the quality of background information available to design the study and the amount of survey effort available, but some points are generally applicable. For instance, every effort should be made to account for all components of detectability and for uncounted parts of the populations. Generally, it is probably easier to generate reliable indices of relative abundance than to accurately determine absolute population size. Hence, careful periodic sampling of populations to generate indices of population and distribution trends may be a more robust and attainable approach for many conservation applications (e.g. Harrison et al., 2014; Mace et al., 2008), although this may not help identify individual sites of high importance.
ACKNOWLEDGEMENTS
We thank BTO survey volunteers and coordinators for their efforts and Anglian Water for funding the survey. Additional funds came from donations by BTO members and supporters and the Nightingale Supporters Group. We are very grateful for the awards made to the survey by 14 charitable grant-making trusts (including the Chapman Charitable Trust, The William Haddon Charitable Trust, The Mercers’ Charitable Foundation, The Michael Marks Charitable Trust and The Jack Patston Charitable Trust) and thank our Fundraising Team, especially Rachel Gostling and Sam Rider, for organising the Nightingale Appeal. The site assessment work was funded by Natural England. We thank Valentin Amrhein, Ian Burfield, Richard Fuller, Tobias Roth and two anonymous referees for helpful discussion and/or comments on a previous version of this manuscript.
AUTHORS’ CONTRIBUTIONS
C.M.H. wrote the manuscript with contributions from A.J., M.M., G.J.C. and R.J.F. All authors read and commented on the manuscript and approved publication. C.M.H. and R.J.F. designed the survey, G.J.C. and J.H.M. coordinated and organised it. C.M.H., A.J., M.M., G.J.C. and R.J.F. conceived and/or carried out the population estimate methods and analyses. R.S. conceived the site assessment and, with C.M.H. and R.J.F., contributed discussion to development of its methods, which C.M.H. and G.J.C. designed and carried out.
DATA ACCESSIBILITY
Data available from the Dryad Digital Repository https://doi.org/10.5061/dryad.87rb0 (Hewson et al., 2018).