Optimised scat collection protocols for dietary DNA metabarcoding in vertebrates
Summary
- DNA metabarcoding of food in animal scats provides a non‐invasive dietary analysis method for vertebrates. A variety of molecular approaches can be used to recover dietary DNA from scats; however, many of these also recover non‐food DNA. Blocking primers can be used to inhibit amplification of some non‐target DNA, but this may not always be feasible, especially when multiple distinct non‐target groups are present.
- We have developed scat collection protocols to optimise the detection of food DNA in vertebrate scat samples. Using shy albatross Thalassarche cauta as a case study, we investigated how DNA amplification success and the proportion of food DNA detected are influenced by both environmental and physiological parameters. We show that both the amount and type of non‐target DNA vary with sample freshness, the collection substrate, fasting period and developmental stage of the consumer.
- Fresh scat samples yielded the highest proportion of food sequences. Collecting scats from dirt substrates reduced the proportion of food DNA and increased the proportion of contaminating DNA. Food DNA detection rates changed throughout the albatross breeding season and related to the time since feeding and the developmental stage of the animal. Fasting albatross produced scats dominated by parasite amplicons in universal PCR analysis, with little food DNA recovered. Samples from very young animals also produced reduced food DNA proportions.
- Based on our observations, we recommend the following procedures for field scat collections to ensure high‐quality samples for dietary DNA metabarcoding studies. Ideally, (i) collect fresh scats; (ii) collect from surfaces with minimal contamination (e.g. rock or ice); (iii) collect scats from animals with minimum time since feeding and avoid fasting animals; and (iv) avoid young animals that are not feeding directly (e.g. not weaned or fledged), or target larger/older individuals. The optimised field sampling protocols that we describe will improve the quality of dietary data from vertebrates by focusing on samples most likely to contain food DNA. They will also help minimise contamination issues from non‐target DNA and provide standardised field methods in this rapidly expanding area of research.
Introduction
Scat samples provide an important source of DNA that can be utilised in a wide range of molecular ecology studies (e.g. Davison et al. 2002; Prugh et al. 2005). Food DNA present in scats provides a non‐invasive and increasingly popular tool for studying vertebrate diet and can be applied to both predators and herbivores (e.g. Deagle, Kirkwood & Jarman 2009; Zeale et al. 2011; Bowser, Diamond & Addison 2013; Kartzinel et al. 2015). Dietary DNA metabarcoding uses high‐throughput sequencing of small, highly variable DNA regions that survive digestion to identify food species (Pompanon et al. 2012). This may involve identification of a particular food species using species‐specific markers (Jarman & Wilson 2004); food within a taxonomic group using group‐specific markers (Jarman, Deagle & Gales 2004; Murray et al. 2011; Zeale et al. 2011); identification of all food taxa using universal metazoan markers (O'Rorke et al. 2012; Jarman et al. 2013); or a combination of these approaches (Deagle, Kirkwood & Jarman 2009; Bowser, Diamond & Addison 2013). However, characterising the entire diet requires ‘universal’ markers that are capable of amplifying DNA from any food species (King et al. 2008; Jarman et al. 2013).
Universal metazoan polymerase chain reaction (PCR) primers amplify from all eukaryotic DNA, but will inevitably also amplify unwanted DNA from non‐food items (Deagle, Kirkwood & Jarman 2009; O'Rorke et al. 2012). Non‐target DNA within the scat may originate from the animal being sampled, its parasites, gut flora or contamination from external organisms such as insects and vegetation. These sources of DNA can dominate the sequences amplified from a sample, making detection of DNA from food items less effective. Sample sizes must consequently be increased to address the underlying questions of a study, increasing processing costs. In some cases, non‐target DNA amplification can be reduced using a blocking primer to suppress amplification of specific DNA types, such as DNA of the defecating animal (O'Rorke, Lavery & Jeffs 2012). However, development of blocking primers is challenging and food sequences may be inadvertently blocked with this approach. The use of blocking primers becomes more complex when there are multiple non‐target DNA groups present. Improved sampling procedures are another approach for increasing the proportion of food DNA identified in a scat.
Selective scat sampling to improve DNA amplification success in genotyping studies has been investigated (Lucchini et al. 2002; Piggott 2004; Panasci et al. 2011; Vynne et al. 2012), but studies to optimise scat collections for DNA dietary analysis are rare (Oehm et al. 2011). Genotyping studies have investigated how the age of scats (Farrell, Roman & Sunquist 2000; Lucchini et al. 2002; Piggott 2004; Panasci et al. 2011; Vynne et al. 2012), habitat type (Vynne et al. 2012) and season (Lucchini et al. 2002; Piggott 2004) affect DNA detection and genotyping accuracy. Fresh scats collected in dry and cool conditions typically provided the highest amplification success and lowest genotyping error rate. However, the time since an animal defecated is seldom known and proxies for scat age are often required. For example, in maned wolf Chrysocyon brachyurus scats, higher moisture content and odour were found to be positively correlated with amplification success (Vynne et al. 2012). Similarly, in brush‐tailed rock‐wallaby Petrogale penicillata scats, colour, consistency and odour correlated well with DNA amplification success (Piggott 2004).
Only one dietary DNA study has examined how field conditions can influence the detection of food DNA. In carrion crow Corvus corone corone scats, exposure to sunlight and rain over a 5‐day period caused significantly lower amplification success of food DNA (Oehm et al. 2011). This was exacerbated by dirt, which may increase the degradation of extracellular DNA (Levy‐Booth et al. 2007). This study used species‐specific markers, which do not amplify non‐food DNA. There are currently no studies that investigate whether targeted sample collections improve the detection of food DNA by universal metazoan markers.
We used shy albatross Thalassarche cauta as a model to develop optimised field protocols for dietary DNA metabarcoding of scats. Albatross are a good example as they follow predictable behavioural patterns, where they return to the colony after feeding and fast on the nest during incubation. This makes scat samples accessible and tests of fasting effects possible. Albatross are known to eat a diverse range of food items, including jellyfish, cephalopods, fish and carrion (Cherel & Klages 1998). Universal metazoan PCR primer sets which amplify from all potential prey groups are therefore needed to screen for all food items. Albatross colonies present far from ideal laboratory conditions. Colonies are typically exposed to extremes of weather, with little or no vegetation cover. Sample degradation by UV and rain is likely to reduce PCR amplification success of exposed scats (Oehm et al. 2011). Contamination from non‐food DNA, such as insects, parasites and fungi, will also reduce the proportion of food DNA detected. Colonies are often remote and expensive to access, on trips that are generally short and/or infrequent, so effective scat collection is imperative.
The optimised field protocols that we developed increase the detection of food DNA by considering the effect of sample freshness; the substrate it was collected from; the bird's breeding and developmental stage; and fasting time. The effects that these factors have on the detection of food DNA are significant enough to be an important consideration when designing dietary DNA studies of vertebrates.
Materials and methods
Case study species
Shy albatross lay one egg from early September to early October. The egg is incubated for 10 weeks (incubation stage), and the hatched chicks are brooded for 3–4 weeks (brood stage). During these two breeding stages, parents alternate nest attendance and foraging trips. After brooding, chicks are left unattended while both parents forage independently at sea to complete chick rearing (chick‐rearing stage; Hedd & Gales 2005). During incubation, foraging trips may last from 1 to 10 days, with an average of 3 days (Hedd, Gales & Brothers 2001); therefore, an incubating bird could be fasting for this period or longer. Foraging trip durations during the brood stage are short at around 1 day and increase slightly during chick rearing to 2–3 days (Brothers et al. 1998; Hedd & Gales 2005).
Field methodology
Shy albatross scat samples were collected at Albatross Island, Tasmania, Australia (40°23′S, 144°39′E). Scat samples were collected during the breeding period over two seasons: the 2013/2014 austral summer (chick rearing only, late March) and the 2014/2015 austral summer (incubation, late September; brood stage, mid December; and chick rearing, late March). Samples were collected during the daytime from albatross observed defecating. A small fragment of the non‐uric acid portion of the scat (dark part) was collected using tweezers or a plastic straw. The sample was stored in 80% ethanol and shaken on collection to mix with the ethanol. The only time fresh scats were not collected was when sample freshness was investigated.
Sample freshness
To determine the effect of sample freshness on DNA amplification rates and the proportion of food DNA detected, scats were collected during the chick‐rearing period in 2013/2014 and 2014/2015. The amount of time a scat had been present was unknown when a scat was found. Consequently, we wanted to provide a proxy measure for freshness to allow selection of higher quality dietary material. To test this, scat samples were categorised as follows: (i) ‘Fresh’ when the bird was seen defecating, (ii) ‘Recent’ when the scat was still wet but the bird was not seen defecating (there was often a skin forming on these scats) or (iii) ‘Dry’ when scats were old and had no apparent moisture.
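The three freshness categories can be written as a simple decision rule. The sketch below is illustrative only: the function name and the two boolean field observations are assumptions introduced here, not part of the study's data sheets.

```python
def classify_freshness(seen_defecating: bool, is_wet: bool) -> str:
    """Return the freshness category for one scat.

    Hypothetical helper: the two arguments stand for what a field worker
    can observe directly when a scat is found.
    """
    if seen_defecating:
        return "Fresh"   # bird was seen defecating
    if is_wet:
        return "Recent"  # still wet, but deposition was not observed
    return "Dry"         # old scat with no apparent moisture


# Example field records: (seen_defecating, is_wet)
records = [(True, True), (False, True), (False, False)]
categories = [classify_freshness(seen, wet) for seen, wet in records]
```

Note that observing defecation takes priority over moisture, so a scat is 'Fresh' only when deposition was witnessed.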
Substrate type
The dominant substrate from which the scat was collected was recorded for all fresh scats collected during chick rearing. Substrate categories included the following: dirt, rock and vegetation.
Breeding stage
To determine whether collecting at different stages of breeding affected the results, we randomly collected from birds in the colony that we saw defecating during incubation, brood guard and chick rearing of the 2014/2015 breeding season.
Developmental stage
When known, the breeding cohort of the bird was recorded as either ‘breeder’ (a bird on an active nest or seen feeding a chick); ‘non‐breeder’ (a bird at an empty nest); or ‘chick’ (either a brooded chick <2 weeks old or a pre‐fledged chick c. 3·5 months old).
Fasting
To test the effect that fasting had on dietary results, additional scats were collected during incubation. Two study sites within the colony were set up, each containing c. 100 nests. Each bird was marked on the chest with a small dot of non‐toxic stock‐marker, using a different colour for each partner so that the amount of time a bird had been incubating could be monitored. Nests were numbered and checked daily at each site, and the incubating bird was recorded. When birds were observed defecating, the scat sample was collected and the nest number and bird colour were recorded. The incubation time was categorised as <1 day, 1–2 days and >2 days.
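One way to derive the incubation‐time category from the daily nest checks is sketched below. It assumes that each nest has an ordered list of the marker colour recorded at each daily check, and that a bird seen on one, two, or three or more consecutive checks has been incubating for <1 day, 1–2 days or >2 days, respectively. This mapping and the function names are our illustrative assumptions; the text does not state how the categories were assigned.

```python
def consecutive_checks(daily_records):
    """Count the trailing daily checks on which the current bird was on the nest.

    daily_records: marker colours, oldest check first (hypothetical format).
    """
    if not daily_records:
        raise ValueError("no nest checks recorded")
    current = daily_records[-1]
    count = 0
    for colour in reversed(daily_records):
        if colour != current:
            break
        count += 1
    return count


def incubation_category(daily_records):
    """Map consecutive checks to the three incubation-time categories."""
    n = consecutive_checks(daily_records)
    if n == 1:
        return "<1 day"    # bird arrived since the previous check
    if n == 2:
        return "1-2 days"
    return ">2 days"       # seen on three or more consecutive checks
```

If the same bird is present on every recorded check, the count is only a lower bound on its time on the nest.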
Molecular methodology
Sample storage and extraction
Samples were stored at 5–10 °C for 1 week while in the field, and then at −20 °C until DNA was extracted. DNA was extracted within 2 weeks of collection using a Promega ‘Maxwell 16’ instrument and a Maxwell® 16 Tissue DNA Purification Kit (Madison, WI, USA). Samples were vortexed prior to extraction, and c. 30 mg of each sample was used. The quantity was consistent across extractions, which were all performed by the same person. PCR inhibitor concentrations were reduced in the DNA by mixing this subsample in 250 μL of STAR buffer (Roche Diagnostics, Basel, Switzerland) prior to extraction.
PCR amplification and amplicon sequencing
A PCR primer set for amplifying c. 170 bp of the V7 region of the nuclear small subunit ribosomal DNA gene (18S; Hadziavdic et al. 2014) was designed manually on an alignment of the region that incorporated representatives from all major animal lineages. A two‐stage PCR process was used to amplify the DNA region and to attach a unique ‘tag’ sequence to each sample, which allows amplified samples to be pooled (Binladen et al. 2007).
Stage one PCRs (10 μL) were performed with 5 μL 2× Phusion HF (NEB), 1 μL 100× bovine serum albumin (NEB, Ipswich, MA, USA), 0·1 μL of 5 μM of each 18s_SSU amplification primer (Table 1), 0·5 μL of Evagreen, 2 μL faecal DNA and 1·3 μL of water. Thermal cycling conditions were 98 °C for 2 min, followed by 35 cycles of 98 °C for 5 s, 67 °C for 20 s and 72 °C for 20 s, with a final extension at 72 °C for 1 min. Each sample was run in triplicate on a LightCycler 480 (Roche Diagnostics). A negative control containing no template DNA and a positive control containing fish DNA were included in each PCR amplification run. If either the negative amplified or the positive failed to amplify, the PCR was rerun. Samples from each experiment were split among different PCR runs to avoid run‐specific biases.
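As a check on the reaction recipe, the component volumes listed for stage one can be summed and the final primer concentration calculated from C1V1 = C2V2. The sketch below reads the primer stock as 5 μM (an assumption about the unit given in the text), and the variable names are illustrative.

```python
import math

# Stage one PCR components in microlitres, as listed in the text.
# Each primer is entered separately (0.1 uL each).
STAGE_ONE_UL = {
    "2x Phusion HF": 5.0,
    "100x BSA": 1.0,
    "forward primer": 0.1,
    "reverse primer": 0.1,
    "EvaGreen": 0.5,
    "faecal DNA": 2.0,
    "water": 1.3,
}
PRIMER_STOCK_UM = 5.0  # assumed stock concentration in micromolar


def total_volume_ul(components):
    """Sum the component volumes of one reaction."""
    return sum(components.values())


def final_concentration_um(stock_um, added_ul, total_ul):
    """Dilution by C1 * V1 = C2 * V2."""
    return stock_um * added_ul / total_ul


total = total_volume_ul(STAGE_ONE_UL)
primer_final = final_concentration_um(
    PRIMER_STOCK_UM, STAGE_ONE_UL["forward primer"], total
)
print(f"total volume: {total:.1f} uL; each primer: {primer_final:.3f} uM")
```

Under these assumptions the components account for the full 10 μL reaction, with each primer at 0·05 μM.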
| PCR round | Primer name | Primer sequence (5′–3′) |
|---|---|---|
| 1 | 18s_SSU3_F | TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGGGTCTGTGATGCCCTTAGATG |
| 1 | 18s_SSU3_R | GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGGGTGTGTACAAAGGGCAGGG |
| 2 | SSU3_Tag_F1 | AATGATACGGCGACCACCGAGATCTACACAGTTCGGACTTCGTCGGCAGCGTC |
| 2 | SSU3_Tag_R1 | CAAGCAGAAGACGGCATACGAGATAGCTTAGGCTGTCTCGTGGGCTCGG |
- Underlined bases in PCR Round 1 are the MiSeq tag primer. Bolded bases in PCR Round 2 are an example of the unique tags attached to each sample. A full list can be found in Appendix S1.
If ≥2 replicates of a sample had a ‘crossing threshold’ (ct) score <30, they were combined to reduce biases produced by amplification from low template concentration samples (Murray, Coghlan & Bunce 2015). Pooled samples were diluted 1 : 10 for the second stage PCR. A unique tag was attached to each sample (Table 1) in 10 μL PCRs with 5 μL 2× Phusion HF (NEB), 1 μL 100× bovine serum albumin (NEB), 1 μL of 1 μM of each tag primer (Appendix S1, Supporting Information) and 2 μL of diluted PCR product from stage one. Thermal cycling conditions were 98 °C for 2 min, followed by 10 cycles of 98 °C for 5 s, 55 °C for 20 s and 72 °C for 20 s, with a final extension at 72 °C for 1 min. Four microlitres of PCR product from each sample (n = 511) and the negative controls were pooled and purified from unincorporated reaction components by reversible binding to Agencourt Ampure (Beckman Coulter, Brea, CA, USA) magnetic beads, with 1·8 μL of Ampure per microlitre of DNA product. Sequencing of PCR products was performed with a MiSeq genome sequencer, using the MiSeq reagent kit V2 (Illumina, San Diego, CA, USA) (300 cycles) with paired‐end reads.
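The replicate pooling rule can be expressed as a short filter. In the sketch below, a sample is carried forward only if at least two of its replicates have ct <30, and only the passing replicates are pooled. Whether failed replicates were also pooled is not stated in the text, so that choice, the function name and the use of `None` for a replicate that did not amplify are assumptions.

```python
CT_THRESHOLD = 30.0  # replicates must have ct strictly below this value
MIN_PASSING = 2      # at least two passing replicates are required


def replicates_to_pool(ct_values):
    """Return the indices of replicates to pool, or [] if the sample fails.

    ct_values: ct score per replicate; None marks no amplification
    (hypothetical encoding).
    """
    passing = [
        i for i, ct in enumerate(ct_values)
        if ct is not None and ct < CT_THRESHOLD
    ]
    return passing if len(passing) >= MIN_PASSING else []


# Example: the second replicate amplified late and is left out of the pool.
pooled = replicates_to_pool([24.1, 31.0, 27.5])
```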
Bioinformatics
Amplicon pools were demultiplexed based on unique 10‐bp Multiplex IDentifiers on the MiSeq, and the fastq files were processed using usearch v8.0.1623 (Edgar 2010). Reads R1 and R2 from the paired‐end sequencing were merged using the fastq_mergepairs function; only merged reads flanked by exact matches to the 18S_SSU primers were retained, and primer sequences were trimmed. Reads from all samples were pooled and dereplicated, then clustered into broad operational taxonomic units (OTUs) using the cluster_otus command (‐otu_radius_pct = 10). Potentially chimeric reads were discarded during this step. Reads for each sample were assigned to these OTUs (usearch_global ‐id 0·9) and a summary table was generated using a custom R script. Each OTU was assigned to a taxon by blast against a local database derived from the SILVA SSU database release 118 (Quast et al. 2013), with a 0·95 similarity used as a cut‐off for identification. OTUs were categorised into seven groups based on their assumed origin: food, bird, parasite, fungi, plant, contaminant and unicellular (Appendix S2; Jarman et al. 2013). The contaminant category included human, insect and ectoparasite sequences. Any sequences that did not match the SILVA database were excluded from analysis (3·2% of the total). Although some species of albatross are known to eat birds, this has rarely been recorded in shy albatross (Hedd & Gales 2001); therefore, in this study, the bird category represented DNA belonging to the albatross.
Statistical analysis
We assessed whether DNA amplification success was affected by the specific variables (sample freshness, substrate, breeding stage, cohort and fasting length). Amplification was deemed successful if the total number of DNA sequences was >500 for a sample. We then examined whether there was a significant difference in the proportion of food DNA detected for each of the variables. Binomial generalised linear models (GLMs) were used to test differences in amplification success, and quasibinomial GLMs (to account for overdispersion in the data) were used to test differences in the proportion of food DNA detected (McCullagh & Nelder 1989). Analysis of deviance (with chi‐squared test) was used to test for significance of predictor terms, with post hoc multiple comparisons by Tukey's method. Analyses were carried out using the R ‘stats’ package (R Core Team 2013), with multiple comparisons using the package ‘multcomp’ (Hothorn, Bretz & Westfall 2008) and plots created using the package ‘ggplot2’ (Wickham 2009).
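The binary response used for the amplification models follows directly from the read threshold. A minimal sketch of how the per‐group summary could be tabulated is given below; the function names and the example read totals are illustrative and are not taken from the study's scripts or data.

```python
MIN_READS = 500  # a sample counts as amplified only if reads exceed this


def amplified(total_reads):
    """True when a sample passes the >500 read threshold."""
    return total_reads > MIN_READS


def summarise_group(read_totals):
    """Return (scats obtained, scats amplified, success percentage)."""
    obtained = len(read_totals)
    successes = sum(1 for reads in read_totals if amplified(reads))
    percentage = 100.0 * successes / obtained if obtained else 0.0
    return obtained, successes, percentage


# Made-up read totals for five scats in one group.
summary = summarise_group([1200, 480, 500, 501, 0])
```

The threshold is strict, so a sample with exactly 500 reads is treated as a failed amplification.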
Results
DNA was extracted from 598 scat samples, of which 511 produced ct values <30 and 458 (89%) produced >500 DNA sequence reads. A total of 2·9 million sequence reads were obtained from the single sequencing run, which included 452 305 (15·6%) food sequences (Fig. 1).

Sample freshness
The freshness of scat samples significantly affected the DNA amplification success (χ² = 7·61, P = 0·02), with fresh scats amplifying better than recent scats, but not better than dry scats (Table 2). Sample freshness also significantly affected the proportion of food DNA in the samples (χ² = 31 808, P = 0·02), with fresh scats containing a greater proportion of food DNA than dry scats, but not significantly more than recent scats (Table 2, Fig. 2). Fungi DNA proportions were higher for dry scats than either fresh or recent (Fig. 3).
| Variable | Level | Scats obtained | Scats with DNA amplified | DNA amplification: success (%) | DNA amplification: estimated value | DNA amplification: SE | Proportion of food DNA: estimated value | Proportion of food DNA: SE | Proportion of food DNA: fitted |
|---|---|---|---|---|---|---|---|---|---|
| Sample freshness | Fresh | 127 | 105 | 82·7 | 1·563* | 0·234 | −1·391* | 0·193 | 0·20 |
| | Recent | 86 | 57 | 66·2 | −0·887* | 0·327 | −0·513 | 0·369 | 0·13 |
| | Dry | 41 | 30 | 73·2 | −0·560 | 0·423 | −1·253* | 0·536 | 0·07 |
| Substrate | Dirt | 90 | 70 | 77·8 | 1·540 | 0·367 | −1·505* | 0·278 | 0·18 |
| | Rock | 104 | 78 | 75·0 | 0·017 | 0·535 | 0·800* | 0·382 | 0·33 |
| Breeding stage | Incubation | 79 | 69 | 87·3 | 1·931* | 0·338 | −2·768+# | 0·346 | 0·06 |
| | Brood | 166 | 119 | 71·7 | −1·002* | 0·380 | 1·893+ | 0·388 | 0·29 |
| | Chick rearing | 63 | 49 | 77·8 | −0·678 | 0·454 | 1·755# | 0·423 | 0·27 |
| Incubation cohort | Breeder | 50 | 44 | 88·0 | 1·992 | 0·435 | −3·439 | 0·649 | 0·03 |
| | Non‐Breeder | 29 | 25 | 86·2 | −0·159 | 0·692 | 1·306 | 0·803 | 0·11 |
| Brood cohort | Breeder | 60 | 49 | 81·7 | 1·494* | 0·333 | 0·180+# | 0·230 | 0·54 |
| | Non‐Breeder | 40 | 31 | 77·5 | −0·257 | 0·505 | −1·659+ | 0·386 | 0·19 |
| | Chick | 66 | 39 | 59·1 | −1·126* | 0·417 | −2·117# | 0·429 | 0·13 |
| Incubation time | Random | 79 | 69 | 87·3 | 1·932 | 0·338 | −2·769+ | 0·318 | 0·06 |
| | <1 day | 52 | 41 | 78·8 | −0·616 | 0·479 | 1·639+ | 0·415 | 0·24 |
| | 1–2 days | 18 | 13 | 72·2 | −0·976 | 0·626 | −1·928 | 2·108 | 0·01 |
| | >2 days | 29 | 24 | 82·7 | −0·363 | 0·597 | −1·087 | 1·174 | 0·02 |
- DNA amplification was analysed using binomial GLMs, the proportion of food DNA using quasibinomial GLMs. Symbols mark values that differ significantly from each other (Tukey's multiple comparison test): *P < 0·05; +,#P < 0·001.
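The ‘Fitted’ column of Table 2 can be recovered from the estimated values if the models are assumed to use a logit link with the first level of each variable as the reference: the first estimate is then the intercept, later estimates are differences from it, and the fitted proportion is the inverse logit of their sum. This parameterisation is our reading of the table rather than something stated in the text. A sketch for the sample freshness rows:

```python
import math


def inv_logit(x):
    """Inverse of the logit link: maps a linear predictor to a proportion."""
    return 1.0 / (1.0 + math.exp(-x))


# Estimated values for the proportion of food DNA (sample freshness rows).
intercept = -1.391  # Fresh, treated here as the reference level
effects = {"Fresh": 0.0, "Recent": -0.513, "Dry": -1.253}

fitted = {
    level: round(inv_logit(intercept + effect), 2)
    for level, effect in effects.items()
}
print(fitted)
```

Under these assumptions the computed proportions (0·20, 0·13 and 0·07) agree with the fitted values reported for fresh, recent and dry scats.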


Substrate type
Only a small number of scats were collected from vegetation; therefore, substrate comparisons were only analysed using the two most common surfaces: dirt and rock. The substrate did not significantly affect amplification success (χ² = 0·001, P = 0·97), but did significantly affect the proportion of food DNA detected (χ² = 14 805, P = 0·04). Scats obtained from rock contained a higher proportion of food DNA than those obtained from dirt (Table 2, Fig. 2), which contained a higher proportion of unicellular DNA (Fig. 3).
Breeding stage
There was a significant difference observed in the DNA amplification success between breeding stages (χ² = 7·988, P = 0·02), with scats collected during the brood stage having lower amplification success (Table 2, Fig. 4). The breeding stage greatly affected the proportion of food DNA detected (χ² = 115 863, P < 0·001), with scats collected randomly during incubation producing significantly lower proportions of food DNA than scats from brood or chick‐rearing stages (Table 2, Fig. 4). Scats collected during incubation were dominated by parasites (98% Cestoda; Fig. 5).


Developmental Stage
During incubation, there was no significant difference between breeders and non‐breeders in DNA amplification success (χ² = 0·053, P = 0·82; Table 2), or the proportion of food DNA detected in scats (χ² = 11 502, P = 0·09; Table 2, Fig. 4). However, during brood guard, the developmental stage of birds did significantly affect the DNA amplification success (χ² = 8·711, P = 0·01). Scats from chicks had a lower amplification success than those from breeders (Table 2, Fig. 4). The proportion of food DNA detected was also significantly affected by the developmental stage during brood guard (χ² = 88 972, P < 0·001), with scats from breeders containing a much higher proportion of food DNA than those from chicks or non‐breeders (Table 2, Fig. 4). During the brood stage, chick scats had a higher proportion of bird, fungi and plant DNA than breeders, whereas non‐breeder scats were dominated by parasites (Fig. 5).
Fasting
The time a bird spent fasting did not significantly affect the DNA amplification success of the scat (χ² = 3·01, P = 0·39), but did strongly affect the proportion of food DNA detected within the scat (χ² = 70 165, P < 0·001). Scats collected from birds incubating for less than a day had a far greater proportion of food DNA detected than scats collected randomly; however, this was not the case for any other incubation length category (Table 2, Fig. 5). Scats from birds that had been incubating longer than 1 day contained predominantly parasite DNA (Fig. 5).
Discussion
Our case study clearly indicates that sample freshness, the substrate the scat was collected from, breeding stage, developmental stage and fasting can all impact the amount of food DNA available for dietary DNA metabarcoding. The scat collection protocol presented here contributes to optimising the amount of food DNA that is identified in vertebrate dietary studies.
Sample freshness
Scat freshness was found to affect both the DNA amplification success and the proportion of food DNA detected. Fresh scats exhibited a higher DNA amplification success than recent scats, but not dry scats. We had expected that both recent and dry scats would have less amplifiable DNA than fresh scats due to degradation during environmental exposure (Oehm et al. 2011). However, dry scats have also had more potential exposure to external contamination, particularly from fungi, which was reflected in the non‐food DNA sequences recovered. When we look specifically at the amount of food DNA amplified from dry scats, this component was significantly less than for fresh scats. Although recent scats had a lower amplification success, the proportion of food DNA detected was not significantly lower than that of fresh scats. This ‘recent’ category contained a wide range of scats, from those defecated within minutes (but not seen), to those exposed for many hours. Therefore, using scats that are still wet may produce dietary information, but larger sample sizes would be required and reliance on small amounts of DNA may reduce data quality (Murray, Coghlan & Bunce 2015).
Samples in this study were collected during the day from a species breeding in an exposed habitat with little protection from UV and rain. Scats collected in protected conditions such as from a shaded area, at night or collected in the early morning may allow amplifiable DNA to persist for longer. For example, in carrion crows, food DNA could be detected for up to 5 days when protected from UV and rain exposure (68% success); however, this was dramatically reduced when scats were left in exposed areas (17·5% success; Oehm et al. 2011). Similarly, Steller sea lion Eumetopias jubatus scats also produced detectable prey DNA for up to 5 days in some samples (Deagle et al. 2005). However, in both of these studies, group‐specific markers were used that detected only food items. In our study, some dry scats still contained food DNA, so it is possible that with the use of group‐specific markers, dietary information may be detectable for longer.
To ensure fresh scats are collected in the field, some studies have captured or contained animals (Kartzinel & Pringle 2015; Lopes et al. 2015), or placed sheets to collect fresh faeces (Deagle et al. 2010; Vesterinen et al. 2016). When manipulation of the surrounding environment is not physically or ethically possible, alternative sampling strategies are required. In optimising scat collections, we did not seek to determine the amount of time in hours or days that a scat could be collected, as this information is unknown when a scat is found. Instead, we wanted to provide a proxy that allows field biologists to selectively collect scats that will provide high‐quality dietary material. Unfortunately, wet recent scats did not provide as much data as fresh scats, which means that observing defecating animals will still be best practice in exposed locations. However, this is often not possible and other proxies may be required to determine sample freshness (e.g. odour, colour, consistency), together with an understanding of how these may change between species, seasons and environments (Piggott 2004; Vynne et al. 2012; Demay et al. 2013).
Given that the proportions of broad categories of DNA change as the sample ages, it is possible that the measured proportions of the various diet species in the samples may change too if DNA from different species degrades at different rates. This should be examined with experimental studies, and care should be taken to ensure consistent collection methods between sites.
Substrate type
Scats collected from rock and dirt had similar amplification success, but scats from rock had a higher proportion of food DNA detected. This is partially consistent with Oehm et al. (2011) who also found that carrion crow scat samples collected from dirt had reduced food DNA detectability, in both protected and exposed samples. However, they found that DNA detection was hampered by an increase in PCR inhibitors. This did not appear to affect the samples in our case study as amplification success was similar between dirt and rock samples. Instead, the presence of non‐food DNA was higher in scats obtained from dirt. Our scats were fresh, compared to 5‐day‐old scats from carrion crows; therefore, the DNA in our samples may not have been as degraded. Shy albatross scat samples from dirt contained a higher proportion of unicellular DNA than from rock. Unicellular eukaryotes are common in soil and these sequences are likely to represent contamination. It is often difficult to separate scat samples from dirt, especially for very liquid samples that have been mixed into the dirt. Seabird colonies can be home to greater densities of microbial communities within the soil (Wright et al. 2010), which may exacerbate the presence of non‐food DNA.
The three scats collected from plants were dominated by plant DNA. Any surface that contains DNA is likely to decrease the amount of food DNA due to increased contamination. An additional complication occurs when the substrate could be incorrectly assigned as a food item. This is particularly relevant for dietary studies on herbivore species when scats are collected from vegetation (Kartzinel et al. 2015), or marine species when scats are collected from the water (Jarman & Wilson 2004). If collecting from vegetation, the substrate species should be recorded to allow appropriate categorisation when interpreting sequencing results.
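One way to use recorded substrate information when interpreting sequencing results is sketched below. This is a minimal illustration rather than the workflow used in this study: the function name, the taxon labels and the read counts are all hypothetical, and the sketch assumes that reads have already been assigned to taxa.

```python
def flag_substrate_taxa(read_counts, substrate_taxa):
    """Split taxon read counts into substrate-flagged and retained groups.

    read_counts: dict mapping taxon name -> number of reads (hypothetical input)
    substrate_taxa: set of taxa recorded at the collection surface in the field
    """
    flagged = {t: n for t, n in read_counts.items() if t in substrate_taxa}
    retained = {t: n for t, n in read_counts.items() if t not in substrate_taxa}
    return flagged, retained


# Illustrative values only; these are not data from this study.
counts = {"Poa": 900, "Engraulis": 60, "Nototodarus": 40}
flagged, retained = flag_substrate_taxa(counts, {"Poa"})

# Proportions recalculated over the retained (non-substrate) reads.
total_retained = sum(retained.values())
food_share = {t: n / total_retained for t, n in retained.items()}
```

Flagged taxa are set aside for review rather than deleted, because for a herbivore the substrate species could also be a genuine food item.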
Breeding and developmental stage
Digestion rates are likely to vary for numerous reasons, for example predator species, metabolic rate, meal size, food type and feeding frequency (Hilton, Houston & Furness 1998). These may all in turn impact the detectability of food DNA in scats. Understanding how feeding behaviour may change throughout the year or breeding season for different developmental stages will impact how and when samples can be collected.
Collections from young animals are likely to pose problems for DNA dietary analysis depending on the way they obtain food. In this case study, young chicks had a lower proportion of food DNA detected than breeding adults and a higher proportion of bird DNA. In many avian species, juvenile food is delivered by regurgitation; therefore, food items are likely to be partially digested before they are fed to the chick. This was the case in white‐chinned petrels Procellaria aequinoctialis, where food in chicks’ stomachs was more digested than that of adults (Connan et al. 2007). Consequently, digestive processes may excessively degrade food DNA in chick scat samples. Additionally, there is presumably crossover of parental DNA to the chick during regurgitation, which may cause the amount of bird DNA to be inflated, thereby reducing the food DNA proportionately. Interestingly, the converse result was seen with Adélie penguins Pygoscelis adeliae, with scats collected from chicks more successful than those from breeders, especially during brood guard when chicks were small. Although a similar marker region was used in both studies, a blocking primer was used in the penguin study to suppress bird DNA amplification, which may explain this result (McInnes et al. 2016).
Scat samples from young vertebrates should ideally be collected when they are directly feeding on the food themselves, rather than through secondary means. For birds fed by regurgitation, this may not be possible during the nestling period; however, samples from older chicks did contain more food DNA. Older shy albatross chicks had a higher food proportion than small chicks, which may reflect larger meals or a reduction in stomach oil. Procellariiform (albatross and petrel) stomachs contain oil that is obtained from digested prey (Imber 1976). This oily liquid can contribute up to 80% of the sample mass in some albatross stomachs (Thompson 1992). In shy albatross, there is a greater mass of oily liquid in younger chicks than older chicks (Hedd & Gales 2001), which may dilute the food DNA. Young animals whose diet is supplemented by milk could have the same issue.
We also observed differences in food detection between breeding cohorts, with lower proportions of food DNA and higher proportions of parasite DNA detectable from scats of non‐breeding animals during brood guard. A non‐breeder was identified by its presence at an empty nest and is likely to be either a failed breeder or subadult bird defending a nest. As these individuals do not need to forage to feed a chick, they may have been ashore longer and therefore could fall into a similar category to fasting/incubating birds. This finding highlights the need to understand not only the biology of the study species, but also which breeding cohorts may be present during scat collections and how this may affect results.
Fasting
The detection of food DNA throughout the season was strongly linked to fasting. Longer periods of fasting during incubation resulted in a low detection of food DNA in scats, whereas food DNA detection was much higher for breeding birds during brood. This is likely to be linked to more frequent feeding trips during this stage. During periods of fasting, non‐target DNA was dominated by endoparasites, rather than external contamination. Cestodes are the main endoparasites in pelagic seabirds, and their presence is largely driven by diet and the availability of intermediate hosts, for example zooplanktonic organisms and fish (Hoberg 1996). Interestingly, there was an apparent increase in parasite DNA during fasting. If the food DNA proportion alone had decreased, then it would be expected that all other DNA groups would increase proportionally. However, there appeared to be a greater increase in the parasite DNA than other groups, suggesting there was an increase in prevalence, not just detection. The exact cause of the increase during fasting is unknown; however, care should be taken when obtaining scats, targeting animals with minimal time since feeding.
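The proportional argument above can be made concrete with a small worked example. The sketch below is illustrative only: the function name and every proportion are hypothetical and are not measurements from this study. It computes the composition expected if only the food proportion had fallen, with all other groups rescaled in their original ratios, and compares it with a hypothetical observed composition.

```python
def expected_if_only_food_changed(baseline, new_food, food_key="food"):
    """Rescale non-food groups in their original ratios after food changes."""
    non_food_total = sum(v for k, v in baseline.items() if k != food_key)
    scale = (1.0 - new_food) / non_food_total
    expected = {k: v * scale for k, v in baseline.items() if k != food_key}
    expected[food_key] = new_food
    return expected


# Hypothetical read proportions for fed and fasting birds (illustration only).
fed = {"food": 0.60, "bird": 0.20, "parasite": 0.10, "other": 0.10}
fasting_observed = {"food": 0.10, "bird": 0.15, "parasite": 0.65, "other": 0.10}

expected = expected_if_only_food_changed(fed, fasting_observed["food"])

# Positive excess: the group rose by more than the loss of food alone explains.
excess = {k: fasting_observed[k] - expected[k] for k in fed if k != "food"}
```

With these made‐up numbers, the parasite proportion expected from the loss of food alone is 0.225, so an observed value of 0.65 would point to a disproportionate increase in parasite DNA rather than a purely compositional effect.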
Fasting periods occur in many species for many reasons, including territory defence, hibernation, meal availability, migration, incubating or suckling young, moult or limited mobility, for example during pregnancy. Understanding when these fasting periods occur in the study species is important for detection of dietary DNA in scats. Although defecation rate does slow during fasting, it often will not cease. Therefore, the risk of collecting scats that contain no dietary information needs to be taken into consideration when planning a study.
Field protocols for DNA scat collection
We have developed a method to allow high‐quality dietary information to be obtained using universal metazoan markers by optimising collection protocols, enabling a reduction in signal from non‐target DNA.
Careful planning of DNA dietary metabarcoding studies prior to sample collection is imperative for overall project success. Researchers should consider the dietary question they are targeting and focus on which scat samples will inform this. This includes marker selection, seasonal changes, fasting and the age of animals. These considerations, especially animal behaviour and developmental stage, are likely to be important to a broad array of molecular ecology studies reliant on DNA in scat samples, or those using eDNA. To improve the quality and quantity of dietary information obtained from scat samples, the following collection protocols should be followed when possible.
- Collect fresh scats where the animal is seen defecating. If this is not possible, try to collect only scats that still have moisture or develop species‐specific proxies that correlate to sample age.
- Give serious consideration to the scat substrate type, as contamination from substrate can overwhelm the food DNA signal. Ideally, collect scats from surfaces with minimal sources of DNA contamination (e.g. rocks or ice). If collecting from dirt or vegetation, try to minimise the collection of foreign material and record the substrate (and species where applicable) to cross‐check and validate results.
- Take into consideration the seasonal behaviour and feeding ecology of the study animal prior to sample collection.
- Avoid collections from animals that may not have fed recently, such as those in periods of fasting.
- Collect from animals that are directly feeding themselves and avoid secondary feeding where possible (including suckling young). Samples from young animals that are being fed by regurgitation may be problematic due to partially digested food passed on by the parents or large amounts of parental DNA. For such species, collection from older animals may be preferable.
- If only a single collection is available and the seasonal timing and cohort are not the focus of the study, target the time period with the shortest time since feeding and focus on adult animals.
- If multiple study sites are used, keep collection protocols and timing consistent between sites.
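The recommendations above could be recorded at the point of collection and used to screen samples before sequencing. The following is a minimal sketch under stated assumptions: the record fields, the substrate categories and the wording of each concern are hypothetical and would need adapting to the study species.

```python
from dataclasses import dataclass


@dataclass
class ScatRecord:
    """Field notes for one scat sample (hypothetical structure)."""
    seen_defecating: bool   # animal observed producing the scat
    still_moist: bool       # moisture used as a freshness proxy
    substrate: str          # e.g. "rock", "ice", "dirt", "vegetation"
    likely_fasting: bool    # e.g. incubating or non-breeding bird ashore
    self_feeding: bool      # False if fed by regurgitation or suckling


LOW_CONTAMINATION_SUBSTRATES = {"rock", "ice"}


def collection_concerns(record):
    """Return the protocol concerns that apply to a sample, if any."""
    concerns = []
    if not record.seen_defecating:
        if record.still_moist:
            concerns.append("freshness inferred from moisture only")
        else:
            concerns.append("scat not fresh")
    if record.substrate not in LOW_CONTAMINATION_SUBSTRATES:
        concerns.append("substrate may contribute non-food DNA")
    if record.likely_fasting:
        concerns.append("animal may not have fed recently")
    if not record.self_feeding:
        concerns.append("animal fed by secondary means")
    return concerns


ideal = ScatRecord(True, True, "rock", False, True)
poor = ScatRecord(False, False, "dirt", True, False)
```

A sample with an empty list of concerns matches all of the recommendations; samples with concerns are not necessarily unusable, but the notes indicate where non‐target DNA is more likely.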
These optimised scat collection protocols provide a basis for future experimental designs and will enable ecologists to collect high‐quality diet samples and reduce non‐target DNA amplification. They also provide standardised field methods which will be important in this rapidly expanding area of research.
Authors’ contributions
Experimental design was carried out by J.M., S.J., R.A. and M.‐A.L. Sample collection was made by J.M. and R.A. Molecular marker testing was conducted by J.M. and S.J. Laboratory experiments were conducted by J.M. Bioinformatics was performed by J.M., S.J. and B.D. Statistical analysis was performed by J.M. and B.R. All authors wrote the manuscript.
Acknowledgements
This project used University of Tasmania Animal Ethics Permit A13745 and Tasmanian DPIPWE Scientific Permits TFA 14049 and TFA 15081. Funding was provided by Australian Antarctic Science Grant (4014) and the Winifred Violet Scott Charitable Trust. This project has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No 668981. Thanks to James Marthick and Menzies Institute (UTAS) for MiSeq use; Kris Carlyon, Sam Thalmann (DPIPWE) and Alistair Hobday (CSIRO) for field assistance; Andrea Polanowski and Cassy Faux (AAD) for laboratory advice; and Simon Wotherspoon (UTAS) for statistical advice.
Data accessibility
Data are accessible from the Australian Antarctic Data Centre (doi:10.4225/15/57EDACF067ADF).




