Volume 10, Issue 12
REVIEW
Open Access

Towards an ecological trait‐data standard

Florian D. Schneider

Corresponding Author

E-mail address: florian.dirk.schneider@gmail.com

unaffiliated, c/o Birgitta König‐Ries, Department of Mathematics and Computer Science, Friedrich‐Schiller‐Universität Jena, Jena, Germany

David Fichtmueller

Botanic Garden and Botanical Museum Berlin, Freie Universität Berlin, Berlin, Germany

Martin M. Gossner

Forest Entomology, Swiss Federal Research Institute WSL, Birmensdorf, Switzerland

Anton Güntsch

Botanic Garden and Botanical Museum Berlin, Freie Universität Berlin, Berlin, Germany

Malte Jochum

Institute of Plant Sciences, University of Bern, Bern, Switzerland

German Centre for Integrative Biodiversity Research (iDiv) Halle‐Jena‐Leipzig, Leipzig, Germany

Institute of Biology, Leipzig University, Leipzig, Germany

Birgitta König‐Ries

Department of Mathematics and Computer Science, Friedrich‐Schiller‐Universität Jena, Jena, Germany

Gaëtane Le Provost

Senckenberg Biodiversity and Climate Research Centre (BiK‐F), Frankfurt am Main, Germany

Peter Manning

Senckenberg Biodiversity and Climate Research Centre (BiK‐F), Frankfurt am Main, Germany

Andreas Ostrowski

Department of Mathematics and Computer Science, Friedrich‐Schiller‐Universität Jena, Jena, Germany

Caterina Penone

Institute of Plant Sciences, University of Bern, Bern, Switzerland

Nadja K. Simons

Department of Ecology and Ecosystem Management, Technische Universität München, Freising, Germany

Ecological Networks, Department of Biology, Technische Universität Darmstadt, Darmstadt, Germany

First published: 19 August 2019

Abstract

  1. Trait‐based approaches are widespread throughout ecological research as they offer great potential to achieve a general understanding of a wide range of ecological and evolutionary mechanisms. Accordingly, a wealth of trait data is available for many organism groups, but these data are underexploited owing to heterogeneity in data formats and definitions and a lack of standardization.
  2. We review current initiatives and structures developed for standardizing trait data and discuss the importance of standardization for trait data hosted in distributed open‐access repositories.
  3. In order to facilitate the standardization and harmonization of distributed trait datasets by data providers and data users, we propose a standardized vocabulary that can be used for storing and sharing ecological trait data. We discuss potential incentives and challenges for the wide adoption of such a standard by data providers.
  4. The use of a standard vocabulary allows for trait datasets from heterogeneous sources to be aggregated more easily into compilations and facilitates the creation of interfaces between software tools for trait‐data handling and analysis. By aiding decentralized trait‐data standardization, our vocabulary may ease data integration and use of trait data for a broader ecological research community and enable global syntheses across a wide range of taxa and ecosystems.

1 INTRODUCTION

Functional traits are phenotypic (i.e. morphological, physiological, behavioural) characteristics that are related to the fitness and performance of an organism (McGill, Enquist, Weiher, & Westoby, 2006; Violle et al., 2007). Recent years have seen a proliferation of trait‐based research in a wide range of fields: trait data have been used to understand the evolutionary basis of individual‐level properties (Salguero‐Gómez et al., 2016), global patterns of biodiversity (Díaz et al., 2016), and the relationship between ecosystem functions and the functional composition of species assemblages (Bello et al., 2010; Mouillot, Graham, Villéger, Mason, & Bellwood, 2013). This research provides the mechanistic framework for linking climate change or anthropogenic land use to biodiversity and its related functions (Allan et al., 2015; Díaz et al., 2011; Lavorel & Grigulis, 2012). Species traits have been suggested as indicator variables for monitoring ecosystem health at the individual level, for instance through changes in body size within a population of fish (Kissling et al., 2018). Because functional traits allow us to infer the ecological role of organisms from their apparent features, regardless of their taxonomic identity (Grime, 2001; Moretti et al., 2017; Villéger, Brosse, Mouchet, Mouillot, & Vanni, 2017), their measurement is also a promising means of bypassing the taxonomic impediment, i.e. the fact that most species are as yet undescribed and little is known of their interactions with other organisms and their environment.

Fully exploiting the potential of trait‐based approaches relies heavily on the broad availability and compatibility of trait data to achieve sufficient taxonomic and regional coverage, both for present‐day taxa and in evolutionary deep time. However, the differing research contexts in which data are collected render trait data extremely heterogeneous and make the task of data compilation time‐consuming and error‐prone. To date, trait data have been harmonized and compiled into centralized databases only for specific organism groups and regional scope, often centred around particular research questions (e.g. PanTHERIA, Jones et al., 2009; TRY, Kattge, Díaz, et al., 2011; AmphiBio, Oliveira, São‐Pedro, Santos‐Barrera, Penone, & Costa, 2017). Less well‐studied taxa and specialized research questions lack the resources for such an endeavour. Besides initiatives aiming at assembling data, tools to enable the compatibility of data across databases are being developed. These include software to access trait data from the Internet (e.g. Ankenbrand, Hohlfeld, Weber, Foerster, & Keller, 2018; Chamberlain, Foster, Bartomeus, LeBauer, & Harris, 2017), semantic web standards (Page, 2008; Wieczorek et al., 2012) and thesauri of consensus terms (Garnier et al., 2017; Walls et al., 2012).

Meanwhile, open and reproducible science has become mainstream: publication of research data without access restrictions, with structured metadata and in accordance with data standards to enable their reuse, has become the declared goal of an open biodiversity knowledge management (http://www.bouchoutdeclaration.org/) and is increasingly demanded by journals and public research funding agencies (Alliance of German Science Organisations, 2010; Royal Society Science Policy Centre, 2012). As a result, an increasing number of individual research projects publish their primary data on general‐purpose file hosting services, where no data standards are enforced upon the uploaded material (Wilkinson et al., 2016). It is thus likely that trait data will become increasingly available, but a lack of data and metadata standardization will hamper the efficient reuse and synthesis of published datasets.

In this paper, we review existing initiatives for trait‐data collection and standardization from the pragmatic view of data providers, data curators and data users, as well as data managers. We discuss current efforts to make trait data findable, accessible, interoperable and reusable in downstream data analysis, as demanded by the FAIR guiding principles for scientific data (Wilkinson et al., 2016). Furthermore, we show how the current deficit in the standardization of primary data hampers the interoperability and reuse of trait data. Based on these considerations, we propose a versatile vocabulary for describing ecological trait datasets, which builds upon, and is compatible with, existing terminology standards for biodiversity data, in particular the Darwin Core Standard for biodiversity data (DwC; Wieczorek et al., 2012). Since a standard vocabulary relies on adoption by a broad research community, we discuss incentives for its use and lay out mechanisms for future consensus‐building and community development towards an accessible and easy‐to‐use ecological trait‐data standard vocabulary.

2 INITIATIVES FOR TRAIT‐DATA STANDARDIZATION

The need for standardizing trait data arises from the prospective gain of compiling heterogeneous trait datasets for data synthesis. Often, the scientific scope and focus differ between data providers, who measure and assess the trait data in the first place, and data users, who reuse published data for a broader synthesis application. Furthermore, data curators and data managers take up the task of providing compiled and harmonized data and preparing them for future use and long‐term preservation. Data managers are concerned with the development of complex digital infrastructures for handling and analysing large amounts of data. These are idealized roles of researchers who deal with trait‐data standardization throughout the data life cycle. In this section, we review four types of initiatives that are of relevance for trait‐data standardization (see Glossary in Table 1 for italicized terms):
  1. Initiatives that provide trait datasets which have been assembled out of a particular research interest, either by measurement or collated from the literature.
  2. Initiatives that aim to harmonize trait data from the literature or from direct measurements into data compilations or database infrastructures and make those data widely available on the Internet.
  3. Initiatives that aim at the standardization and development of consensus measurement methods and definitions for traits and provide standard terminologies.
  4. Initiatives that aim to combine data (1 & 2) and terminologies (3) into formalized structures for knowledge representation to link trait data to a wider set of biodiversity data.

Table 1. Glossary of terms from the biodiversity data‐management context as they are used in this paper; draws from Garnier et al. (2017)

Term | Definition
Concept | An idea, notion or object that is made explicit in an information context by a term definition, and referenced to a URI or other accessible reference
Controlled vocabulary | A list of terms that gives all valid consensus terms for a particular context, while no unlisted entries are accepted
Darwin Core Standard (DwC) | Body of terms intended to facilitate the sharing of information about biological diversity; maintained by the Biodiversity Information Standards TDWG (http://rs.tdwg.org/dwc/)
Dataset | A set of measurements and observations, often stored in a data‐table and originating from a single experimental set‐up or study context; can be considered as being internally homogeneous across all data entries
Database | A structured collection of data, usually organized as multiple data tables linked via identifiers into relational databases; usually constructed using a specific database management system, i.e. a software to provide a (offline or online) user interface
File repository | A storage or archiving of datasets on file‐hosting services like Figshare.com, Dryad (datadryad.org), Researchgate.net, or Zenodo.org; online repositories make data available for public access, provide metadata, state conditions of reuse, and (not always) facilitate citations via persistent identifiers, e.g. DOIs (Digital Object Identifiers)
Identifier (ID) | A unique label that relates data entries to information within and across datasets or external items of information; may be used to connect multiple data tables into a database; can be user‐specific or, in form of a URI, point to a globally valid ontology or thesaurus
Metadata | Data documentation of the higher‐level information or instructions; describe the content, context, quality, structure, provenance and accessibility of a data object (Michener, 2006). In the context of trait data, such additional information can move to the body of the primary data table when data are compiled from different sources
Occurrence | The observation context of a single individual, i.e. the existence of an organism at a particular place and time; sometimes used as synonym of ‘observation’ in data management context
Ontology | A semantic model of the objects and their relationships in a domain of interest (Gruber, 1995); defines terms and concepts in a formal language that provides cross‐references and semantic meaning; commonly published in OWL format for machine readability
Semantic web | An extension of the World Wide Web that aims for machine‐readable meaning of information via well‐defined data standards, ontologies and exchange protocols (Berners‐Lee et al., 2001); the World Wide Web Consortium (W3C) defines standards, i.e. specifications of protocols and technologies for the semantic web (http://www.w3.org/standards/semanticweb/)
Term | A word that names or labels a particular concept as part of the specialized vocabulary of a field
Terminology | The body of terms and concepts used with a particular application in a subject of study, usually formalized in a thesaurus or ontology
Thesaurus | Controlled vocabulary that provides key terms with their associated concepts and relations for a specific field or domain of interest (Laporte et al., 2013); e.g. may define a hierarchy of broader or narrower terms
Uniform Resource Identifier (URI) | An unambiguous pointer to a unique resource on the Internet; used to refer to single terms of a thesaurus or ontology; Example: http://purl.obolibrary.org/obo/TO_0000391

We consider these initiatives separately although they are often developed in conjunction to serve a particular database project, such as the TRY plant database (Kattge, Díaz, et al., 2011; Kattge, Ogle, et al., 2011) and the Thesaurus of Plant characteristics (TOP; Garnier et al., 2017). We show that the degree of trait‐data standardization in existing datasets is highly variable, and which tools and standards are currently applied to harmonize data from multiple, distributed sources. The objective of this review is to raise awareness of the generic structure of trait data and to help researchers share and publish their own datasets in an appropriate form.

2.1 Trait datasets

In the field of comparative biology, morphological traits, such as traits related to flower shape, leaf and stem structures for plants or wing and beak measurements for birds, as well as life‐history traits such as Ellenberg values for plants or physiological and reproductive traits for animals (e.g. feeding biology, dispersal, metabolic rate and body size) have been assessed for decades and have been published in regular journal articles or books. With the rise of ecological trait‐based research, measurements and information available from species descriptions have been compiled into project‐specific datasets that typically comprise a local set of taxa and a focal set of traits. A plethora of such static datasets has been published alongside scientific articles, or as standalone data publications (see Kleyer et al., 2008 for a review on plant data; for animal data, e.g. Gossner et al., 2015 and Appendix S1, Table A1).

Today, the online publication of such data is greatly facilitated by file hosting services (e.g. Figshare, Zenodo, Researchgate, Data Dryad), which ensure long‐term accessibility and citability via DOIs, and govern data sharing via license statements. These platforms offer the hosting of publicly accessible file repositories at low cost or for free, which makes them attractive for small and intermediate‐sized research projects that cannot dedicate extra resources to data management. Most importantly, these platforms enable public hosting of data with very low quality thresholds regarding metadata documentation and data standardization. Thus, although open for download, the trait datasets on such data repositories might be stored in variable tabular structures and labelled following self‐defined terms, which makes extraction and further use unnecessarily tedious.

For trait data, there are common issues arising from the variability of data structures and metadata quality. In terms of structure, trait data usually are reported in a species × traits wide‐table format. In this intuitive data table, each row represents a species (or taxon) for which multiple traits are reported in columns. Similarly, when reporting raw data, researchers place observations on individual organisms in rows with multiple trait measurements applied to the same individual across multiple columns. Covariates on the taxon, the individual specimen (e.g. sex or life‐stage) or context of observation (e.g. time and place of sampling) would be placed in additional columns and would further expand the two‐dimensional data table. The resolution or scope of these covariates varies greatly depending on the research question and observation context. The column descriptions and terminology applied to taxa and traits are mostly project‐specific and rarely chosen for compatibility with larger database initiatives. Variability in the number and meaning of columns in these data tables requires tedious manual adjustments when merging multiple datasets (Wickham, 2014). Furthermore, metadata provided along with the primary data vary in their level of detail, e.g. for documenting descriptions of variables, measurement procedures or sampling context (Kattge, Ogle, et al., 2011). While, in some datasets, information like geolocation or sampling date and time might be dataset‐level information, thus qualifying as metadata, in other datasets it might be collected at the level of individual observations (see section on data compilations below). More importantly, clear statements on ownership and authorship, terms of use, or internationalization (e.g. separators and delimiters), are often still neglected in primary trait‐data publications.
The task of harmonizing trait data is taken up by data‐curating initiatives, which compile heterogeneous data into comprehensive databases (see next section).
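
To illustrate why the wide‐table format complicates merging, the following minimal sketch reshapes a species × traits table into one record per taxon and trait, so that tables with different trait columns can simply be concatenated. The column names, taxon labels and values are hypothetical placeholders chosen for illustration and do not stem from any of the datasets discussed here.

```python
# Hypothetical species x traits wide table; all names and values are illustrative only.
wide_rows = [
    {"species": "Species A", "body_length_mm": 12.5, "wing_length_mm": 9.0},
    {"species": "Species B", "body_length_mm": 7.1, "wing_length_mm": None},
]


def wide_to_long(rows, id_column="species"):
    """Turn each non-missing trait cell into one (taxon, trait, value) record."""
    long_rows = []
    for row in rows:
        for column, value in row.items():
            # Skip the identifier column itself and empty cells.
            if column == id_column or value is None:
                continue
            long_rows.append(
                {"taxon": row[id_column], "trait": column, "value": value}
            )
    return long_rows


long_rows = wide_to_long(wide_rows)
for record in long_rows:
    print(record)
```

In the long form, adding a dataset with further traits only adds rows, whereas in the wide form every new trait requires a new column in every merged table.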

2.2 Data compilation initiatives

In the past two decades, many distributed trait datasets have been aggregated and harmonized into larger collections with a particular taxonomic or regional focus (e.g. Kleyer et al., 2008; Oliveira et al., 2017, see Appendix S1, Table A1). While these initiatives successfully address issues of heterogeneity in units or categorical variables, or achieve high taxonomic or geographic coverage, few of these compilations apply a standardized terminology for taxa or trait definitions. Additionally, in the process of data aggregation, rich metadata content might be lost, as the detail in the original files differs, while the reference to the original dataset becomes obscured, as only aggregated values are reported (e.g. means or medians). Such trait‐data compilations are often labelled ‘database’, although they do not formally provide data in a database structure in the strict data‐management sense. Instead, the data are released as static data tables of raw measurements or aggregate trait values on journal websites or open‐access file hosting platforms, which may be updated irregularly.

As they deal with much larger amounts of data, initiatives that compile data from natural history museum collections are traditionally more concerned with standardization. The amount of morphological measurement data extracted from museum collections and herbaria is likely to skyrocket in the near future due to digitization efforts supported by new technology for scanning and pattern recognition (Smith & Blagoderov, 2012, and references therein; Ströbel, Schmelzle, Blüthgen, & Heethoff, 2018) and citizen science initiatives (e.g. www.markmybird.org). For example, the VertNet database compiled and harmonized large quantities of vertebrate trait data from collections; the resulting data are published as versioned data tables which are updated as new data sources become available (http://vertnet.org, Guralnick et al., 2016).

Specialized online portals have been created to attract data submissions from a defined research field and take care of data harmonization, thereby greatly facilitating data synthesis. For example, by aiming for a universal framework for plant traits, the TRY database (Kattge, Díaz, et al., 2011) attracted more data submissions and downloads than any other trait‐data platform. The online portal enables selective data download and management of user permissions. For animal trait data, however, a single unified platform and harmonizing scheme is still lacking. Nonetheless, initiatives for particular groups of animals do exist. Examples are the BETSI database on soil invertebrate traits (http://betsi.cesab.org/; Pey et al., 2014), the Carabids.org web portal (http://www.carabids.org/), the Coral Trait Database (Madin et al., 2016), or the Global Ants Database (Parr et al., 2017, see Appendix S1, Table A1). The role of online portals and database initiatives in standardizing data and making them more accessible is paramount. Trait‐data portals incentivize data submissions by offering increased data visibility and usage, while providing data‐use policies that secure author attribution and, potentially, co‐authorship of associated articles. However, maintaining centralized database infrastructures is costly and requires long‐term funding (Bach et al., 2012).

2.3 Terminology standards for traits

A major challenge in trait‐data standardization is the lack of widely accepted and unambiguous trait definitions (Kissling et al., 2018). Previous standard definitions of trait concepts range from listings of selected definitions in vocabularies, through well‐defined method handbooks and comprehensive thesauri, to formalized definitions of trait concepts in ontologies. The initiatives behind method handbooks, thesauri and ontologies are essential for building community consensus for trait definitions.

Very general classes of traits are defined within the list of GeoBON Essential Biodiversity Variables (Kissling et al., 2018), which aims to provide a list of functional indicators for ecosystem health.

Assigning a detailed and unambiguous methodological protocol for a trait, including the units to use or the ordinal or factor levels to be assigned, is essential for standardizing its measurement process. Efforts to develop handbooks for measurement protocols provide such a methodological standardization for plants (Cornelissen et al., 2003; Perez‐Harguindeguy et al., 2013) or invertebrates (Moretti et al., 2017), but are of limited use in harmonizing trait data that pre‐date or ignore this standard (Kattge, Ogle, et al., 2011).

A thesaurus provides a ‘controlled vocabulary designed to clarify the definition and structuring of key terms and associated concepts in a specific discipline’ (Garnier et al., 2017; Laporte, Garnier, & Mougenot, 2013). To provide a logical structure for trait terms, Garnier et al. (2017) suggest the Entity‐Quality model (EQ), where a trait is defined as ‘an entity having a quality’ (for instance, for the trait ‘femur length’, ‘femur’ is the entity and ‘length’ the quality). In thesauri, hierarchies of concepts can be formalized by linking each term to broader or narrower terms, or to synonyms. For example, the definition of ‘femur length of first leg, left side’ is narrower than ‘femur length’ which is narrower than ‘leg trait’ which is narrower than ‘locomotion trait’. Where a thesaurus is publicly available, its defined terms can also be referenced via globally unique Uniform Resource Identifiers (URIs). For example, a measurement of fruit mass could be linked to the definition of the term within the Thesaurus of Plant characteristics (TOP, Garnier et al., 2017) via its URI ‘http://top-thesaurus.org/annotationInfo?viz=1&&trait=Fruit_mass’.
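
The broader/narrower hierarchy described above can be sketched as a simple lookup structure. The four terms are taken from the femur‐length example in the text; the dictionary layout and the function names are our own illustrative assumptions and do not reflect how any particular thesaurus is implemented.

```python
# Each term maps to its broader term; None marks the top of this small hierarchy.
BROADER = {
    "femur length of first leg, left side": "femur length",
    "femur length": "leg trait",
    "leg trait": "locomotion trait",
    "locomotion trait": None,
}


def broader_chain(term, broader=BROADER):
    """Return the term followed by all successively broader terms."""
    chain = []
    while term is not None:
        chain.append(term)
        term = broader[term]
    return chain


def is_narrower_than(term, candidate, broader=BROADER):
    """True if `candidate` is a direct or indirect broader term of `term`."""
    return candidate in broader_chain(term, broader)[1:]


print(broader_chain("femur length of first leg, left side"))
print(is_narrower_than("femur length", "locomotion trait"))
```

Such a structure would allow a data user to request all measurements of a ‘leg trait’ and retrieve entries that were annotated with any of its narrower terms.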

In addition to defining terms for human interpretation, ontologies define terms by their relationship to other defined terms, thereby providing a semantic model of the concepts used within a domain of research, with the objective of enabling the computational interpretation of data (Kissling et al., 2018; Walls et al., 2012, 2014). The Plant Trait Ontology (TO) definition of the concept ‘seed size’ contains references to other globally defined terms: ‘A seed morphology trait (TO:0000184) which is the size of a seed (PO:0009010)’. Thus, trait definitions may refer to related terms or synonyms defined in other trait ontologies or other scientific ontologies, like units as defined by the Units of Measurement Ontology (Gkoutos, Schofield, & Hoehndorf, 2012). By providing ontologies in a formalized syntax, like Web Ontology Language (OWL), a machine‐readable web of definitions is spun across the Internet allowing researchers and search engines to relate independent trait measurements with each other and connect them to the wider semantic web of online data (Berners‐Lee, Hendler, & Lassila, 2001; Gruber, 1995; Page, 2008; Walls et al., 2012).
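
As a small illustration of how such cross‐references can be made machine‐actionable, the sketch below extracts the term identifiers from the ‘seed size’ definition quoted above and expands them into URIs. The expansion pattern is inferred from the example URI given in Table 1 (http://purl.obolibrary.org/obo/TO_0000391); that the same pattern applies to both TO and PO identifiers is an assumption of this sketch, and the function names are hypothetical.

```python
import re

# Base of the URI pattern shown in the Table 1 example; assumed here to hold
# for every prefix that appears in the definition.
OBO_PURL_BASE = "http://purl.obolibrary.org/obo/"

definition = (
    "A seed morphology trait (TO:0000184) which is the size of a seed (PO:0009010)"
)


def find_term_ids(text):
    """Return (prefix, local_id) pairs such as ('TO', '0000184') found in text."""
    return re.findall(r"\b([A-Z]+):(\d{7})\b", text)


def to_uri(prefix, local_id):
    """Expand a compact identifier into a full URI."""
    return f"{OBO_PURL_BASE}{prefix}_{local_id}"


uris = [to_uri(prefix, local_id) for prefix, local_id in find_term_ids(definition)]
for uri in uris:
    print(uri)
```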

Comprehensive trait thesauri have been developed in TOP (which is employed in the TRY database, Garnier et al., 2017) and in the Thesaurus for Soil Invertebrate Trait‐based Approaches (T‐SITA, http://t-sita.cesab.org/, Pey et al., 2014). Ontologies of trait definitions have been developed for plants (e.g. the Plant Ontology, Jaiswal et al., 2005; Walls et al., 2012; the Flora Phenotype Ontology, Hoehndorf et al., 2016), and for specific animal taxa (e.g. the Hymenoptera Anatomy Ontology, Yoder, Miko, Seltmann, Bertone, & Deans, 2010; the Vertebrate Trait Ontology, Park et al., 2013). The UBERON ontology is an integrated cross‐species anatomy ontology for all animals, which combines concepts from different existing ontologies, with wide application in biomedical or physiological research (Mungall, Torniai, Gkoutos, Lewis, & Haendel, 2012).

To conclude, there is already a suite of globally available thesauri and ontologies for traits. However, definitions in some domains are better covered than others (Kissling et al., 2018), and different curation strategies and measures for peer‐review and community building are employed. To address this, the OBO Foundry provides a development platform for (biological) ontologies and offers review and quality control (Smith et al., 2007, http://www.obofoundry.org/). While defined vocabularies are increasingly used in biodiversity data management, distributed trait data of smaller projects published on general‐purpose file servers rarely refer to standard terminologies. Finding and applying the most suitable and highest‐quality ontology from the range of available ontologies is not an easy task for ecological researchers. To reduce this effort, meta‐ontology initiatives, like Ontobee (http://www.ontobee.org/), Bioportal (https://bioportal.bioontology.org/, Whetzel et al., 2011), or the GFBio Terminology Service (Karam et al., 2016, https://terminologies.gfbio.org/), provide centralized hosting for trait ontologies, structured browsing, and harmonized web services for computational access.

2.4 Trait‐data structures

While trait thesauri and trait ontologies typically define concepts of measurements and observations for focal groups of organisms, they do not specify the format or structure in which trait data should be stored and labelled.

A trait dataset typically contains multiple data entries, where each entry describes a trait value observed on an instance of a scientific taxon. The item on which the value has been observed can vary widely, ranging from an occurrence of an individual at a specific place and time in its natural environment or a preserved specimen in a collection (Figure 1a), through a group of individuals of a specific taxon (Figure 1b), to an entire population of a species (Figure 1c,d). The reported trait values may be quantitative measurements or qualitative facts. Quantitative measurements are values obtained either by direct morphological, physiological or behavioural observations on single specimens (Figure 1a), by aggregating replicated measurements on multiple entities (Figure 1b) or by estimating the means or ranges for the respective taxon as reported in the literature or other published sources (e.g. databases, Figure 1c). This encompasses a wide range of numeric data types, including continuous, binary, integer, intervals or ratios, as well as categorical (ordinal or nominal) values. Qualitative facts are assignments of categorical information, often on entire taxa, e.g. of a behavioural or life‐history trait (Figure 1d).

Figure 1. Types of ecological trait data assume different entities or reported qualities: (a) morphometric or morphological measurements of individual body features (lengths, areas, volumes, weights) or other quantities related to life history (e.g. reproductive rates, life spans); (b) aggregated trait values are reported as means taken on multiple measures of organisms of a taxon; (c) quantitative traits may be extracted from literature or existing databases, referring to the entire taxon (or a subset, e.g. a sex) as the subject of description; (d) qualitative traits are categorical, ordinal or binary descriptors of the entire species or higher taxonomic level (also called ‘facts’)
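
The four types of data entry distinguished in Figure 1 can be represented by a single record type. The following is only a rough sketch: the field names, the use of the number of measured individuals to separate cases (a) to (c), and the example values are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class TraitEntry:
    """One trait value reported for a taxon (all field names are hypothetical)."""

    taxon: str
    trait: str
    value: Union[float, str]
    unit: Optional[str] = None
    # Number of organisms behind the value; None if the value refers to the
    # taxon as a whole, e.g. when taken from the literature.
    n_individuals: Optional[int] = None


def entry_type(entry):
    """Classify an entry along the four cases (a) to (d) of Figure 1."""
    if isinstance(entry.value, str):
        return "qualitative fact"            # case (d)
    if entry.n_individuals == 1:
        return "individual measurement"      # case (a)
    if entry.n_individuals is not None and entry.n_individuals > 1:
        return "aggregated measurement"      # case (b)
    return "taxon-level value"               # case (c)


examples = [
    TraitEntry("Species A", "body length", 12.5, "mm", n_individuals=1),
    TraitEntry("Species A", "body length", 12.1, "mm", n_individuals=20),
    TraitEntry("Species A", "body length", 12.0, "mm"),
    TraitEntry("Species A", "feeding guild", "herbivore"),
]
types = [entry_type(e) for e in examples]
print(types)
```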

Beyond these core observations, further information might be available that specifies the taxon concept applied, provides detail on the measurement method, or places the reported measurement in a broader observation context (including geolocation as well as date and time of sampling). As such data may be useful for future analysis of the causes of trait variation or to explain noise in measurement data, they should always be published along with the core data. In most cases, information on place and time applies to the entire dataset, and thus would be included in the metadata accompanying a data publication (potentially applying Ecological Metadata Language, EML, KNB, 2011 as a formal structure). In the case of trait data and depending on the research scope, the information may also have been collected at the level of the measurement, occurrence or taxon. Geolocation or date and time would then not be provided as metadata, but as covariate data in additional columns of the primary dataset. When compiling datasets, it is a key task of data curators to deal with dataset‐level information and maintain it for downstream analysis by incorporating it into the compiled data table.
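
The curation step described above, in which dataset‐level information is carried into the rows of a compiled table, might look as follows. This is a minimal sketch under stated assumptions: the structure of the input datasets, the key names and all values are hypothetical, and row‐level values are assumed to take precedence over dataset‐level metadata.

```python
def compile_datasets(datasets):
    """Merge datasets into one table, copying dataset-level metadata into rows."""
    compiled = []
    for dataset in datasets:
        for row in dataset["rows"]:
            # Later keys win in a dict merge, so row-level values override
            # dataset-level metadata of the same name.
            compiled.append({**dataset["metadata"], **row})
    return compiled


# Dataset A reports place and time once, for the whole dataset.
dataset_a = {
    "metadata": {"datasetID": "A", "locality": "Site 1", "eventDate": "2015-06"},
    "rows": [{"taxon": "Species A", "trait": "body length", "value": 12.5}],
}
# Dataset B records place and time per observation.
dataset_b = {
    "metadata": {"datasetID": "B"},
    "rows": [
        {
            "taxon": "Species B",
            "trait": "body length",
            "value": 7.1,
            "locality": "Site 2",
            "eventDate": "2016-07-03",
        }
    ],
}

compiled = compile_datasets([dataset_a, dataset_b])
for row in compiled:
    print(row)
```

After compilation, every row carries its own locality and date, regardless of whether the source reported them as metadata or as covariate columns.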

Standard terms for the formal description of the common concepts of biodiversity knowledge have been provided in the schema for biological collection records (Access to Biological Collection Data, ABCD; Holetschek, Dröge, Güntsch, & Berendsohn, 2012) or the Darwin Core Standard for biodiversity data (DwC; Wieczorek et al., 2012). Both DwC and ABCD are ratified standards of the Biodiversity Information Standards (TDWG, http://www.tdwg.org), which is a global network to support the development and wide adoption of exchange standards for biodiversity data. These terms may be used for defining columns in data tables that contain measurement values, units and categorical levels, taxon names, variables such as sex or life stage, information on time and date of observation and methodological details (Robertson, Döring, Wieczorek, DeGiovanni, & Vieglais, 2009). A suite of terminology extensions links to and expands the capacities of DwC (Wieczorek et al., 2012). Of particular importance for trait data is the ‘MeasurementOrFact’ extension, which typically would be used in database management and bioinformatics to structure trait observations (Parr et al., 2016).
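
As an illustration of how such terms can serve as column labels, the sketch below renames project‐specific column names to Darwin Core terms. The target names (scientificName, sex, lifeStage, eventDate, measurementType, measurementValue, measurementUnit) are terms of DwC and its MeasurementOrFact extension; the source column names and the record are hypothetical, and the mapping is one possible choice rather than a prescribed one.

```python
# Mapping from hypothetical project-specific column names to Darwin Core terms.
COLUMN_TO_DWC = {
    "species": "scientificName",
    "sex": "sex",
    "stage": "lifeStage",
    "date": "eventDate",
    "trait": "measurementType",
    "value": "measurementValue",
    "unit": "measurementUnit",
}


def rename_columns(record, mapping=COLUMN_TO_DWC):
    """Return a copy of the record with its keys replaced by mapped terms."""
    unmapped = sorted(set(record) - set(mapping))
    if unmapped:
        # Fail loudly instead of silently dropping unknown columns.
        raise KeyError(f"no mapping defined for columns: {unmapped}")
    return {mapping[column]: value for column, value in record.items()}


record = {
    "species": "Species A",
    "sex": "female",
    "trait": "body length",
    "value": 12.5,
    "unit": "mm",
}
dwc_record = rename_columns(record)
print(dwc_record)
```

Raising an error for unmapped columns is a deliberate design choice in this sketch: it forces the data provider to decide explicitly how each project‐specific column relates to the standard.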

While the above‐mentioned standards provide terms and concept definitions, and the logical relationships between them, they do not prescribe an explicit structure for trait data. Based on the terms of DwC, the Extensible Observation Ontology (OBOE, Madin et al., 2007; Schildhauer et al., 2016) formalizes observations and measurements into a machine‐readable ontology, which can thus be integrated easily into larger database management systems. By applying this scheme for plant traits, Kattge, Ogle, et al. (2011) propose a generic database structure that covers most potential use cases of trait‐based ecology. This data structure is built around a central data table that contains observations of individual plants linked to several measurements of traits via identifiers. The observations are also linked to a taxonomy and metadata descriptors of the observation context, like location or experimental treatment. Kissling et al. (2018) discuss different ontologies (including OBOE) that formalize the structure of observation data and conclude that for the use cases of trait data these ontologies are still difficult to integrate.
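
The general idea of a central observation table linked via identifiers to measurements, a taxonomy and the observation context can be sketched as a small relational schema. The table and column names below are hypothetical and greatly simplified; this is not the schema proposed by Kattge, Ogle, et al. (2011), and the inserted values are placeholders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE taxon (
        taxon_id        INTEGER PRIMARY KEY,
        scientific_name TEXT NOT NULL
    );
    -- Central table: one row per observed individual and its context.
    CREATE TABLE observation (
        observation_id INTEGER PRIMARY KEY,
        taxon_id       INTEGER NOT NULL REFERENCES taxon(taxon_id),
        location       TEXT,
        treatment      TEXT
    );
    -- Several trait measurements can point to the same observation.
    CREATE TABLE measurement (
        measurement_id INTEGER PRIMARY KEY,
        observation_id INTEGER NOT NULL REFERENCES observation(observation_id),
        trait          TEXT NOT NULL,
        value          REAL,
        unit           TEXT
    );
    """
)
conn.execute("INSERT INTO taxon VALUES (1, 'Species A')")
conn.execute("INSERT INTO observation VALUES (1, 1, 'Site 1', 'control')")
conn.executemany(
    "INSERT INTO measurement VALUES (?, ?, ?, ?, ?)",
    [(1, 1, "leaf area", 20.0, "cm2"), (2, 1, "plant height", 0.5, "m")],
)

# Resolve the identifiers to obtain a flat table for analysis.
rows = conn.execute(
    """
    SELECT t.scientific_name, o.location, m.trait, m.value, m.unit
    FROM measurement AS m
    JOIN observation AS o ON o.observation_id = m.observation_id
    JOIN taxon AS t ON t.taxon_id = o.taxon_id
    ORDER BY m.measurement_id
    """
).fetchall()
for row in rows:
    print(row)
```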

The Encyclopedia of Life (EOL) has proposed TraitBank (Parr et al., 2016) as a standard structure for uploading data on physiological and life‐history traits of all kingdoms of life. It is to date the most general approach to an integrated structure for trait data. The framework employs established terms provided by the DwC and the DwC MeasurementOrFact extension (Parr et al., 2016). Additional layers of information cover bibliographic references, multimedia archives and ecological interactions. TraitBank invites data submissions to the EOL database in a structured Darwin Core Archive (DwC‐A, GBIF, 2017), which is a set of simple text files (csv), a file to specify relationships between these text files (called meta.xml), and a file for metadata descriptions using EML (called EML.xml; see GBIF, 2017 for specifications). Archives can be validated before upload at https://tools.gbif.org/dwca-validator/.
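To make the three components of such an archive concrete, the following Python sketch assembles a toy archive in memory. It is a simplified illustration, not a submission‐ready example: the data file names, the rows and the measurement value are invented, the lower‐case name eml.xml is our choice (the descriptor declares which file holds the metadata), and the EML file is only a stub. GBIF (2017) remains the authoritative specification.

```python
import io
import xml.etree.ElementTree as ET
import zipfile

# Invented example content; a real archive would carry the provider's data.
TAXA_CSV = "taxonID,scientificName\nt1,Bellis perennis\n"
MEASUREMENTS_CSV = (
    "taxonID,measurementType,measurementValue,measurementUnit\n"
    "t1,fruit mass,1.5,mg\n"
)

DWC = "http://rs.tdwg.org/dwc/terms/"

# Descriptor: names the core file, the extension file and how they are linked.
META_XML = f"""<archive xmlns="http://rs.tdwg.org/dwc/text/" metadata="eml.xml">
  <core rowType="{DWC}Taxon" encoding="UTF-8" fieldsTerminatedBy="," ignoreHeaderLines="1">
    <files><location>taxa.csv</location></files>
    <id index="0"/>
    <field index="1" term="{DWC}scientificName"/>
  </core>
  <extension rowType="{DWC}MeasurementOrFact" encoding="UTF-8" fieldsTerminatedBy="," ignoreHeaderLines="1">
    <files><location>measurements.csv</location></files>
    <coreid index="0"/>
    <field index="1" term="{DWC}measurementType"/>
    <field index="2" term="{DWC}measurementValue"/>
    <field index="3" term="{DWC}measurementUnit"/>
  </extension>
</archive>
"""

# Stub only: a real archive needs a complete EML document here.
EML_STUB = "<!-- placeholder for the EML metadata document -->\n"


def build_archive() -> bytes:
    """Zip the text files, the descriptor and the metadata stub together."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("taxa.csv", TAXA_CSV)
        archive.writestr("measurements.csv", MEASUREMENTS_CSV)
        archive.writestr("meta.xml", META_XML)
        archive.writestr("eml.xml", EML_STUB)
    return buffer.getvalue()


# Read the archive back and list the data files named in the descriptor.
with zipfile.ZipFile(io.BytesIO(build_archive())) as archive:
    member_names = sorted(archive.namelist())
    descriptor = ET.fromstring(archive.read("meta.xml"))

NS = {"dwca": "http://rs.tdwg.org/dwc/text/"}
data_files = [loc.text for loc in descriptor.iterfind(".//dwca:location", NS)]
print(member_names, data_files)
```

The extension rows point back to the core table through the first column (coreid), which is how one taxon record can carry any number of trait measurements.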

All of these structures suggest the use of stable URIs to refer to taxon concepts. The difficulties with keeping taxonomic references intact along with continuous changes in taxonomy consensus are a central challenge of biodiversity data management and are beyond the scope of this review (Franz et al., 2016). Initiatives that aim at providing a stable reference while tracking the changing taxon concepts are for instance the Catalogue of Life (https://www.catalogueoflife.org/) or the EDIT Platform for Cybertaxonomy (https://cybertaxonomy.eu/). The GBIF Backbone Taxonomy (GBIF Secretariat, 2017) collects and bundles existing terminologies into a single reference framework.

2.5 Closing gaps to improve trait‐data reuse

In sum, we attest to a gap between the trait‐data structures developed for data curators and data managers and the data input produced by data providers. Hardly any of the aforementioned standalone or aggregated trait datasets for birds, amphibians, mammals or invertebrates employs the described standard terminologies, ontologies or data standards. As it stands, reusing these data in larger compilations or integrating them into structured database initiatives is error‐prone and labour‐intensive and the potential for a broad synthesis is diminished.

One likely reason for this lack of standardization is the complexity of the task: the proposed data structures are designed for multi‐layered, relational databases rather than for standalone datasets for which a two‐dimensional data table may suffice. In the eyes of the data‐provider, in most cases, any co‐variate can be appended as extra columns to the dataset. The other reason is lack of awareness of the need for trait‐data standardization among data providers, who are not trained in the demands of biodiversity data‐management. In addition, complying with what may be non‐intuitive data structures is an investment without clear incentive or immediate pay‐off, and hardly affordable for small and intermediate‐size research projects, especially since funders often do not require these efforts to be included into proposals.

By filling this gap, data‐brokering services (e.g. the German Federation for Biological Data; http://gfbio.org, Diepenbroek et al., 2014; Data Observation Network for Earth, DataONE, Michener et al., 2011) or data management systems for scientific projects (e.g. KNB and its open‐source database back‐end Metacat, https://knb.ecoinformatics.org/; Diversity Workbench, http://diversityworkbench.net; BEXIS2, http://bexis2.uni-jena.de/) are likely to gain importance. These services simplify and direct the standardized upload of research data and descriptive metadata into reliable and interlinked data infrastructures. The goal of such initiatives is to facilitate data reuse by providing standardization of data, for instance by mapping to unambiguous terminologies and ontologies for biodiversity data and clarifying conditions of data reuse.

Another solution for data users to access trait data in a structured way is offered by decentralized tools and tool chains that facilitate the use and analysis of trait data. For instance, the R package traits (Chamberlain et al., 2017) contains functions to extract trait data directly from their source, including Birdlife, EOL TraitBank or BetyDB. The package tr8 provides similar access to plant traits from a list of databases (including LEDA, BiolFlor and Ellenberg values; Bocci, 2015) and aggregates them into a species × traits wide‐table. FENNEC (Ankenbrand et al., 2018) is an online tool or self‐hosted service capable of extracting trait information from multiple sources for a target species community.

A more widespread implementation of ontologies would advance the possibilities to integrate datasets and reduce noise and uncertainty when aggregating data. First, groups of trait researchers must take up the task of developing consensus definitions into semantically defined ontologies that are useful for their use case. Platforms like OBO Foundry can help structure this process. Second, the reference to ontologies and thesauri must be incentivized and facilitated for individual data providers by the development of tools for matching concepts from the available ontologies to their data. Third, frameworks for providing trait data in an unambiguous and machine‐readable structure must be simplified to match the limited resources of small and intermediate research projects. This can be achieved by extending documentation or providing tools for the application of existing ontology frameworks and database structures (e.g. data validator services), and by defining easy‐to‐use standard vocabularies that enable the interoperability of data at minimal effort.

However, no unified and widely adopted terminology for primary trait‐data publications has emerged across the multiple sub‐disciplines of trait‐based research. In the following chapter, we propose a unified vocabulary for trait data that can serve as a minimal consensus for describing and labelling trait data. The simplicity of this standard terminology will lower the thresholds and offer high pay‐off in the visibility and reuse of published data. By establishing this as a ‘best‐practice’ in trait‐based research, trait data will eventually fulfil the FAIR guiding principles for scientific data (Wilkinson et al., 2016).

3 INTRODUCING THE ECOLOGICAL TRAIT‐DATA STANDARD VOCABULARY

As a response to the challenges outlined above, we propose a versatile standard vocabulary for trait‐based ecological research. The Ecological Trait‐data Standard Vocabulary (ETS) is accessible at https://terminologies.gfbio.org/terms/ets/pages/ and combines terms of DwC with newly defined terms to cover the variety of trait‐based approaches and their different needs to report measurement detail. Rather than prescribing a data structure or exchange format, the vocabulary is intended as a more inclusive terminology that can be used in three major use cases:
  1. by data providers: for publication of standardized primary data on open‐access data repositories, or for labelling project‐specific data for local use and exchange with collaborators, e.g. in two‐dimensional data tables or project databases,
  2. by data users and data curators: as a consensus vocabulary when compiling data from distributed sources into aggregate datasets, e.g. to map standardized columns and refer to taxa and trait definitions in a uniform way, and
  3. by data managers: in developing data exchange formats between online resources, web services and software tools, e.g. when providing database queries via a web service or defining input and output formats of software packages.

All terms may be applied to describe columns of a data table (Figure 2; see Appendix S2 for best‐practice principles and examples for publishing primary data). By applying these standard terms, data providers can ensure that the description of trait measurements uploaded into public data repositories will be unambiguous. It will facilitate interoperability of published data and enable their reuse for future data aggregation initiatives and data synthesis, while warranting long‐term accessibility.

Figure 2. Formats used for trait datasets: (a) taxon‐level trait data compiled from literature or aggregated from measurements are often published as a compiled species × traits wide‐table; (b) observation long‐tables are a well‐defined and tidy data format, reporting one single measurement per row and relating it to a standard trait definition and accepted taxon name; (c) additional columns may provide original names for maintaining author‐side continuity, and identifiers that refer to taxa and trait concepts via unambiguous URI pointers. Additional identifiers relate each row to other layers of information on (d) the taxon resolution, the individual organism (i.e. occurrence), or the origin of or confidence in the reported measurement or fact
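The relationship between the wide‐table (a) and the long‐table (b) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function name wide_to_long, the trait column labels and all values are invented for the example, and units are assumed to be known per trait column.

```python
def wide_to_long(wide_rows, units, taxon_column="scientificName"):
    """Reshape a species x traits wide-table into one measurement per row."""
    long_rows = []
    for row in wide_rows:
        for column, value in row.items():
            if column == taxon_column or value in ("", None):
                continue  # skip the taxon label itself and empty cells
            long_rows.append({
                "scientificName": row[taxon_column],
                "traitName": column,
                "traitValue": value,
                "traitUnit": units.get(column, ""),  # empty for categorical traits
            })
    return long_rows


# Invented example data: two species, one numeric and one categorical trait.
wide_table = [
    {"scientificName": "Bellis perennis", "plant height": "10", "life span": "perennial"},
    {"scientificName": "Poa annua", "plant height": "", "life span": "annual"},
]
units = {"plant height": "cm"}

long_table = wide_to_long(wide_table, units)
for entry in long_table:
    print(entry)
```

Empty cells in the wide‐table simply produce no row in the long‐table, which is one reason the long format handles sparse trait matrices well.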

The definitions of terms are hosted on the GFBio Terminology Service (Karam et al., 2016, https://terminologies.gfbio.org/), providing permanent and redirectable individual URIs and URLs for each term. The service can be accessed programmatically (i.e. via the API; https://terminologies.gfbio.org/api/terminologies/).

Our vocabulary offers three extensions to contain additional information on the context of the observation along with the core data in analogy to DwC extensions (‘Taxon’, ‘Measurement or Fact’, and ‘Occurrence’; see section on extensions below). Further terms are provided for dealing with typical dataset‐level information on authorship and rights of reuse of the data (based on terms of Dublin Core Metadata Initiative, DCMI), as well as for defining own trait concepts (see section on metadata below). Aspects not covered by the vocabulary may draw from terms provided by other existing terminologies (in particular DCMI and DwC and its extensions), or be added as user‐defined columns (which should then be clearly specified in the metadata‐information accompanying the dataset).

3.1 Building community consensus

In designing this vocabulary, we drew on the combined expertise of empirical biodiversity researchers (data providers), biodiversity synthesis researchers (data users), and biodiversity informatics researchers (data managers). The aim was to develop a simple, easy‐to‐use template for standalone trait‐data publications or data compilations, to facilitate their reuse for synthesis and integration into larger database structures. Earlier proposals for trait‐data standards (e.g. Kattge, Ogle, et al., 2011; Parr et al., 2016) have been designed for relational database structures from a data manager perspective, which may be the reason why they have so far hardly been adopted for primary data publications. We paid particular attention to these existing data standards (e.g. Garnier et al., 2017; Kattge, Díaz, et al., 2011; Kattge, Ogle, et al., 2011; Madin et al., 2007; Parr et al., 2016) to maximize compatibility.

Nonetheless, we are aware of the diverse use‐cases of trait data that might not yet be covered by the current version of the vocabulary. The version presented here is a mere starting point of a community effort towards a consolidated and comprehensive Ecological Trait‐data Standard Vocabulary, as a key resource for trait‐data standardization in ecological research. For future development of the vocabulary, we will engage with a broader community of trait researchers, in particular via the Open Traits Network (http://opentraits.org, Gallagher et al., 2019), and work towards full compatibility with other initiatives of biodiversity data standardization by collaborating with Biodiversity Information Standards TDWG (Taxonomic Databases Working Group, http://www.tdwg.org). This will also link our initiative to other trait‐based research fields, like biomedical and agricultural research. We invite communities of all trait‐based research fields to discuss, revise and submit terms and extensions of the vocabulary (coordinated via GitHub Issues at https://github.com/EcologicalTraitData/ETS/issues). The standard vocabulary will be released in subsequent versions and published as a stable reference on the GFBio Terminology Service.

3.2 Specification of core terms

To qualify as trait data complying with the ETS, the following content is required at minimum (Figure 2b):
  1. a value (column traitValue) and – for numeric values – a standard unit (traitUnit);
  2. a descriptive trait name (traitName) that links the observation to a standardized definition (i.e. a concept);
  3. the scientific taxon name (scientificName) for which the measurement or fact was obtained that links the observation to an accepted taxon concept.
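These minimum requirements lend themselves to a simple automated check. The sketch below is one possible reading of the three points, not an official validator: the function names are hypothetical, and a value is treated as numeric if Python can parse it as a number.

```python
REQUIRED_COLUMNS = ("scientificName", "traitName", "traitValue")


def is_numeric(text):
    """Return True if the text can be parsed as a number."""
    try:
        float(text)
    except ValueError:
        return False
    return True


def check_row(row):
    """List the minimum-content problems found in one long-table row."""
    problems = []
    for column in REQUIRED_COLUMNS:
        if not str(row.get(column, "")).strip():
            problems.append(f"missing {column}")
    value = str(row.get("traitValue", "")).strip()
    unit = str(row.get("traitUnit", "")).strip()
    if value and is_numeric(value) and not unit:
        problems.append("numeric traitValue without traitUnit")
    return problems


# Invented rows: the second one lacks a unit for a numeric value.
rows = [
    {"scientificName": "Bellis perennis", "traitName": "fruit mass",
     "traitValue": "1.5", "traitUnit": "mg"},
    {"scientificName": "Bellis perennis", "traitName": "fruit mass",
     "traitValue": "1.5"},
]
for row in rows:
    print(check_row(row))
```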

The traitName and scientificName should use unambiguous terms that assign both to clearly defined concepts. Disambiguation can ultimately be ensured by adding globally valid Uniform Resource Identifiers (URIs) for taxon (taxonID) and trait definitions (traitID). For example, referring to the GBIF Backbone Taxonomy, the taxonID for Bellis perennis would be ‘https://www.gbif.org/species/3117424’; the traitID for ‘fruit mass’ according to the Flora Phenotype Ontology would be ‘http://purl.obolibrary.org/obo/FLOPO_0005265’. Wherever possible, the field traitID should point to an unambiguous trait definition in a published ontology. If no suitable reference exists, trait data should always be accompanied by a dataset‐specific listing of trait concepts. Such a controlled vocabulary would, in its simplest form, associate each trait name with an unambiguous definition of the trait and an expected format of measured values or reported facts (e.g. units or valid factor levels). Ideally, this definition refers to or refines terms from published trait ontologies. By providing a minimal vocabulary for trait lists within the ETS, we hope to facilitate the unambiguous definition of traits for trait datasets. This vocabulary might also prove useful for the future publication of trait ontologies.
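A dataset‐specific trait list of this kind can be as simple as a lookup table. The following Python sketch shows how such a list might be used to attach identifiers and to check values. The two URIs are those given in the text; the keys expectedUnit and factorLevels, the second trait, the assumed unit and the function names are illustrative assumptions rather than ETS terms.

```python
# Hypothetical dataset-specific trait list (keys are illustrative only).
TRAIT_LIST = {
    "fruit mass": {
        "traitID": "http://purl.obolibrary.org/obo/FLOPO_0005265",
        "expectedUnit": "mg",  # assumed unit for this example
        "factorLevels": None,
    },
    "life span": {
        "traitID": None,  # no published concept referenced in this sketch
        "expectedUnit": None,
        "factorLevels": ("annual", "perennial"),
    },
}
TAXON_IDS = {"Bellis perennis": "https://www.gbif.org/species/3117424"}


def find_problems(row):
    """Compare one row against the expected unit or factor levels."""
    concept = TRAIT_LIST.get(row["traitName"])
    if concept is None:
        return ["trait not defined in trait list"]
    problems = []
    levels = concept["factorLevels"]
    if levels is not None and row["traitValue"] not in levels:
        problems.append("unexpected factor level")
    expected = concept["expectedUnit"]
    if expected is not None and row.get("traitUnit") != expected:
        problems.append("unexpected unit")
    return problems


def annotate(row):
    """Add traitID and taxonID columns where an identifier is known."""
    concept = TRAIT_LIST.get(row["traitName"], {})
    return {
        **row,
        "traitID": concept.get("traitID"),
        "taxonID": TAXON_IDS.get(row["scientificName"]),
    }


row = {"scientificName": "Bellis perennis", "traitName": "fruit mass",
       "traitValue": "1.5", "traitUnit": "mg"}
print(find_problems(row), annotate(row))
```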

To ensure compatibility with project‐specific databases or analytical code, it might be in the interest of the data author to keep user‐specific identifiers for those terms, for which we are suggesting the use of verbatimScientificName and verbatimTraitName (Figure 2c). By allowing user‐side entries along with consensus terms, we acknowledge the fact that most authors have their own schemes for standardization which may refer to different scientific community standards (as also practiced in TRY, Kattge, Díaz, et al., 2011; Kattge, Ogle, et al., 2011). The redundancy of labelling allows for continuity for data providers while also enabling quality checks and comparability for data curators.

Similarly, standardization of units can be achieved by relying on SI base units or by relating units to unambiguous concepts via URIs provided by ontologies (Gkoutos et al., 2012; Keil & Schindler, 2018; Madin et al., 2007). For categorical or binary traits, the categories should conform to expected levels as defined in the trait concept or be unambiguously defined in the metadata of the dataset. The vocabulary offers terms for keeping the user‐defined values in dataset‐specific units and factor levels along with standardized entries (verbatimTraitValue and verbatimTraitUnit, Figure 2c).
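As a brief illustration of keeping verbatim and standardized entries side by side, the sketch below converts a mass measurement to the SI base unit while preserving the original value and unit. The conversion table, the function name and the example value are assumptions made for this example; a project may equally standardize to another well‐defined unit.

```python
# Conversion factors to the SI base unit of mass (kilogram).
TO_KILOGRAM = {"kg": 1.0, "g": 1e-3, "mg": 1e-6}


def standardize_mass(row):
    """Move the reported value and unit to verbatim columns and add kg values."""
    factor = TO_KILOGRAM[row["traitUnit"]]
    return {
        **row,
        "verbatimTraitValue": row["traitValue"],
        "verbatimTraitUnit": row["traitUnit"],
        "traitValue": float(row["traitValue"]) * factor,
        "traitUnit": "kg",
    }


# Invented measurement reported in milligrams.
reported = {"scientificName": "Bellis perennis", "traitName": "fruit mass",
            "traitValue": "250", "traitUnit": "mg"}
standardized = standardize_mass(reported)
print(standardized)
```

Because the verbatim columns are retained, a data curator can always recompute or audit the standardized value.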

3.3 Extensions for additional data layers

Beyond measurement units or higher taxon information, further information might complement the core data which are related to the individual specimen, the reported fact, measurement or sampling event. We propose three extensions of the vocabulary that should be used to describe this information (Figure 2d), in line with the existing DwC extension structure:
  1. The Taxon extension provides further terms for specifying the taxonomic resolution of the observation and to ensure the correct reference in case of synonyms and homonyms.
  2. The MeasurementOrFact extension provides terms to describe information at the level of single measurements or reported facts, such as the original literature reference for the reported value, the method of measurement or statistical method of aggregation. It provides important information that allows for the tracking of potential sources of noise or bias in measured data (e.g. variation in measurement method) or aggregated values (e.g. statistical method), as well as the source of reported facts (e.g. literature source or expert reference).
  3. The Occurrence extension contains vocabulary to describe information on the observation context of individual organisms, such as sex, life stage or age. This also includes the method of sampling and preservation, as well as the date and geographical location, which provide an important resource to analyse trait variation due to differences in space and time.

These additional layers of information can either be added as extra columns to the core dataset or kept in separate data sheets, thus avoiding redundancy and duplication of content. A unique identifier links to these other datasheets, encoding single measurements or reported facts (measurementID) or individual organisms of a species (occurrenceID).

The concept of ‘occurrence’ is prone to cause confusion. By definition of DwC it is ‘An existence of an Organism at a particular place at a particular time’. Thus, any individual observed twice would have two distinct ‘occurrences’. If sampling of an individual is only performed once, this results in any occurrence being semantically identical with the individual organism (i.e. the DwC term ‘organism’). Some data types directly refer to existing global identifiers for occurrence IDs, e.g. a GBIF URI or a stable identifier references the precise specimen at a particular place and time from which the measurement was taken (Groom, Hyam, & Güntsch, 2017; Güntsch et al., 2017). Also, as ‘occurrence’ is strictly defined by a date‐time event, it may be identical to the common‐sense concept of ‘observation’. As such, data entries for location of sampling (provided in column locationID) and sampling campaigns (eventID), which are often recorded and published along with trait data, are tightly linked to the concept of ‘occurrence’. As occurrence is the narrower term and the key concept for linking an individual organism to a location and sampling event in DwC, and since it is indeed relevant to distinguish between multiple ‘occurrences’ of the same organism in some trait‐based research applications, the ETS sticks to this terminology.

Identifiers can also be used to provide a structure within the measurement data table, e.g. to link rows of measurements on the same individual (by having entries share the same ID in column occurrenceID). Similarly, the values of multivariate measurements can be linked by using the same measurementID for several rows.
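The two uses of identifiers described above, linking to a separate data sheet and grouping rows within the measurement table, can be sketched as follows. All identifiers and values are invented, the function names are hypothetical, and the occurrence sheet is reduced to the DwC columns sex and lifeStage for brevity.

```python
from collections import defaultdict

# Invented measurement long-table: two traits on o1, one trait on o2.
measurements = [
    {"measurementID": "m1", "occurrenceID": "o1", "traitName": "body length",
     "traitValue": "12.1", "traitUnit": "mm"},
    {"measurementID": "m2", "occurrenceID": "o1", "traitName": "body mass",
     "traitValue": "35", "traitUnit": "mg"},
    {"measurementID": "m3", "occurrenceID": "o2", "traitName": "body length",
     "traitValue": "10.4", "traitUnit": "mm"},
]

# Separate occurrence sheet, keyed by occurrenceID to avoid repeating content.
occurrences = {
    "o1": {"sex": "female", "lifeStage": "adult"},
    "o2": {"sex": "male", "lifeStage": "adult"},
}


def attach_occurrence(rows, occurrence_sheet):
    """Join each measurement row with the context of its occurrence."""
    return [{**row, **occurrence_sheet.get(row["occurrenceID"], {})} for row in rows]


def traits_by_occurrence(rows):
    """Collect all traits measured on the same individual occurrence."""
    grouped = defaultdict(dict)
    for row in rows:
        grouped[row["occurrenceID"]][row["traitName"]] = row["traitValue"]
    return dict(grouped)


joined = attach_occurrence(measurements, occurrences)
grouped = traits_by_occurrence(measurements)
print(joined[0])
print(grouped)
```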

The terms of the extensions draw from terms of the DwC extensions of particular relevance for trait data. See the documentation of the ETS for further detail on the use of extensions.

3.4 Specification of metadata

Dataset‐level information about structure, provenance of data, authorship and data ownership, as well as terms of use should be considered when sharing and working with trait datasets (Kissling et al., 2018; Michener, 2006). In the case of primary measurement data, this information usually applies to the entire trait dataset, and would be stored along with the published data as metadata entered in a template provided by the file hosting service. To facilitate interoperability and computational evaluation of metadata, specific standards for metadata may be provided, e.g. by applying Ecological Metadata Language (EML, KNB, 2011). Whenever data from different sources are compiled into a single dataset, metadata information would become part of the resulting data table, as each data entry would have to maintain reference to the original data provider and conditions of reuse of these data. This can be achieved by appending the metadata terms as columns to the core dataset, or by linking to a secondary data table via an unambiguous datasetID (e.g. a URI pointing to the source DOI) and a descriptive datasetName (e.g. a descriptive name for the source). The ETS metadata vocabulary provides terms for a minimal set of information that should be provided along with trait data. The suggested terms originate from Dublin Core Metadata Initiative (DCMI), and are widely compatible with terms provided by the DataCite Metadata Schema (DataCite Metadata Working Group, 2019). The terms can be extended and complemented by using terms from these resources.
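Appending metadata terms as columns during compilation might look like the following Python sketch. The function name, the rows and the identifiers are placeholders (the example.org addresses stand in for real source DOIs); only datasetID and datasetName are terms taken from the text.

```python
def compile_datasets(sources):
    """Merge rows from several sources, keeping a reference to each origin."""
    compiled = []
    for source in sources:
        for row in source["rows"]:
            compiled.append({
                **row,
                "datasetID": source["datasetID"],
                "datasetName": source["datasetName"],
            })
    return compiled


# Placeholder sources; real compilations would use the published DOIs.
sources = [
    {
        "datasetID": "https://example.org/dataset-a",
        "datasetName": "Example grassland plant traits",
        "rows": [{"scientificName": "Bellis perennis", "traitName": "fruit mass",
                  "traitValue": "1.5", "traitUnit": "mg"}],
    },
    {
        "datasetID": "https://example.org/dataset-b",
        "datasetName": "Example annual plant traits",
        "rows": [{"scientificName": "Poa annua", "traitName": "life span",
                  "traitValue": "annual", "traitUnit": ""}],
    },
]

compiled = compile_datasets(sources)
for row in compiled:
    print(row["scientificName"], row["datasetID"])
```

Keeping the dataset reference on every row means that the provenance and conditions of reuse survive any later filtering or subsetting of the compiled table.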

In order to ensure traceability, the metadata of any dataset that employs the ETS should refer to the specific online version that was used to build the dataset, e.g. by entering ‘Schneider, F.D., Jochum, M., Le Provost, G., Penone, C., Ostrowski, A. and Simons, N.K., 2019, Ecological Traitdata Standard Vocabulary v0.10, https://doi.org/10.5281/zenodo.2605377, URL: https://terminologies.gfbio.org/terms/ets/pages/’ in the metadata field conformsTo. Wherever referring to individual terms of the vocabulary in publications or metadata, this should be done via their individual URIs.

4 DISCUSSION

To serve the demand for the standardization and harmonization of ecological trait data which has arisen from a growing number of distributed datasets of different research contexts, we propose a versatile vocabulary for the publication of new datasets, for the creation of data compilations, and for the exchange and handling of trait data in the context of the semantic web.

Consensus building on how traits are to be used and evaluated is currently under way in several fields of ecological research with their taxonomic focus and project‐specific questions (Garnier et al., 2017; Kissling et al., 2018; Moretti et al., 2017; Pey et al., 2014). Such community discussions on trait definitions and measurement practices are leading to a better quality of data, naturally. However, they still require a stronger linkage into the global biodiversity data initiatives. With our proposal of an Ecological Trait‐data Standard Vocabulary (ETS), we aim to capture the common core concept of trait data in a single resource terminology and provide a starting point for the development of a joint language and terminology around traits as a cross‐sectoral topic of ecological and evolutionary research. To enable the ETS to capture the different approaches in trait‐based research across fields, we invite researchers to contribute to future versions of the standard vocabulary and develop their own applications and ontologies that interact with it. Development will also aim at linking the initiative to the joint efforts for biodiversity data terminologies, in particular within Biodiversity Information Standards (TDWG).

Data released according to consensus standards, especially if published under open‐access licenses, are more easily reused in compilations and synthesis studies. By providing the ETS, an easy‐to‐use vocabulary for trait‐based research, the investment of time and resources in trait‐data standardization before publication will be mitigated for individual researchers and small research projects. A well‐defined minimal vocabulary for metadata will also ensure that authorship and terms of use are appropriately documented along the data life cycle. However, for these incentives to take effect, data publications and data citations must become viewed as a valid scientific contribution to the community and recognized in the professional evaluation of individual researchers (Costello, 2009; Roche, Kruuk, Lanfear, & Binning, 2015).

At the community level, shifting the task of standardization from the data‐user side to the data‐owner side yields great gain in accuracy and reduces the risk of misinterpretation. For instance, measurement results depend very much on the precise methodology used and often systematic biases could be corrected for when providing an unambiguous definition. On the other hand, plausibility checks and evaluation of statistical methods, e.g. for aggregating trait values to the species level, can only be done in comparison across a wide array of datasets. Currently, these ‘big data’ volumes are only available in centralized databases. However, to establish a best practice of data aggregation, an exploration and evaluation of different methods for quality assessment and quality control should be subject to a community discussion. This is only possible with large quantities of distributed data being available in a harmonized way. The ETS facilitates such a community‐driven comparison.

Without clearly defined terms and concepts, handling of large amounts of trait data by computational assistance systems for scientific analysis (‘e‐Science’) will be massively hampered (Wilkinson et al., 2016). The ETS represents an important building block for a unified mode to ease data exchange between web services and software packages and thus facilitates the development of a software toolchain for the trait‐data lifecycle. Having well‐defined terms is also a key precondition for developing exchange formats between large database initiatives and biodiversity data archives. Even further downstream, readying the primary data for the semantic web via references to ontologies and data standards will ease the application of automatized big‐data mining and machine‐learning techniques.

5 CONCLUSION

To date, there is a rich, distributed body of independently published trait datasets, each with a specific focus on particular organism groups, ecosystem types or regions. These distributed data are heterogeneous in form and description, hampering endeavours to harmonize, compile and analyse these data.

Using a standard vocabulary with globally accessible definitions of terms would allow distributed trait data to be more easily reused and harmonized into aggregated datasets. The biggest challenge in future standardization of trait data may be consensus building for standard terms, the establishment of incentives and the development of tools for a user‐side standardization before the publication of data. This requires significant effort, but it returns great scientific benefit by enabling data‐heavy synthesis for a general understanding of biodiversity and ecosystem functioning.

ACKNOWLEDGEMENTS

Thanks to all respondents to an internal online survey on trait data for the Biodiversity Exploratories project and to Diana Bowler, Klaus Birkhofer, Runa Boeddinghaus, Markus Fischer, Jens Kattge (and the TRY Steering Committee), Felicitas Löffler, Catrin Westphal and two anonymous reviewers for comments on the manuscript drafts and pre‐print, as well as the Ecological Trait‐data Standard vocabulary. We are grateful to the organizers and participants of the Open Traits workshop in New Orleans, USA, in August 2018. We thank the past and present scientific coordinators, local managers and data managers of the Biodiversity Exploratories program for their work, and Markus Fischer, Eduard Linsenmair, Dominik Hessenmöller, Daniel Prati, Ingo Schöning, François Buscot, Ernst‐Detlef Schulze, Wolfgang W. Weisser and the late Elisabeth Kalko for their role in setting up the Biodiversity Exploratories program. The work has been partly funded by the DFG Priority Program 1374 ‘Infrastructure‐Biodiversity‐Exploratories’ (DFG‐Refno. Po362/18‐3, MA7144/1‐1, WE3081/21‐1, KO2209/12‐2); MMG obtained funding from Swiss National Science Foundation (SNF 310030E‐173542/1); MJ was supported by the German Research Foundation within the framework of the Jena Experiment (FOR 1451) and by the Swiss National Science Foundation.

    AUTHORS’ CONTRIBUTIONS

    F.D.S., A.O., C.P. and N.K.S. conceived the idea and developed the vocabulary for the trait‐data standard with significant contributions of M.J. and G.L.P.; C.P. and F.D.S. curated the living spreadsheet; A.G. and D.F. implemented the vocabulary in the GFBio terminology service; all authors contributed critically to the structure and content of the manuscript and gave final approval for publication.

    DATA AVAILABILITY STATEMENT

    The online reference for the Ecological Trait‐data Standard Vocabulary described in this paper is (Schneider et al., 2019). Any future development of the vocabulary is coordinated via https://github.com/EcologicalTraitData/ETS/.
