Volume 10, Issue 10 p. 1632-1644

Applications for deep learning in ecology

Sylvain Christin

Canada Research Chair in Polar and Boreal Ecology, Department of Biology, University of Moncton, Moncton, NB, Canada
Éric Hervet

Department of Computer Science, University of Moncton, Moncton, NB, Canada
Nicolas Lecomte (Corresponding Author)

Canada Research Chair in Polar and Boreal Ecology, Department of Biology, University of Moncton, Moncton, NB, Canada

Email: [email protected]
First published: 05 July 2019


  1. A lot of hype has recently been generated around deep learning, a novel group of artificial intelligence approaches able to break accuracy records in pattern recognition. Over the course of just a few years, deep learning has revolutionized several research fields such as bioinformatics and medicine with its flexibility and ability to process large and complex datasets. As ecological datasets are becoming larger and more complex, we believe these methods can be useful to ecologists as well.
  2. In this paper, we review existing implementations and show that deep learning has been used successfully to identify species, classify animal behaviour and estimate biodiversity in large datasets like camera-trap images, audio recordings and videos. We demonstrate that deep learning can be beneficial to most ecological disciplines, including applied contexts, such as management and conservation.
  3. We also identify common questions about how and when to use deep learning, such as what are the steps required to create a deep learning network, which tools are available to help, and what are the requirements in terms of data and computer power. We provide guidelines, recommendations and useful resources, including a reference flowchart to help ecologists get started with deep learning.
  4. We argue that at a time when automatic monitoring of populations and ecosystems generates a vast amount of data that cannot be effectively processed by humans anymore, deep learning could become a powerful reference tool for ecologists.


Over the course of just a few years, deep learning, a branch of machine learning, has permeated various scientific disciplines and everyday tasks. This artificial intelligence discipline has become increasingly popular thanks to its high flexibility and performance. Deep learning algorithms rose to prominence in 2012 when they broke accuracy records in image classification (Krizhevsky, Sutskever, & Hinton, 2012) and speech recognition (Hinton et al., 2012). Since then, this technology has expanded rapidly, revolutionizing the way we use computer power to automatically detect specific features in data and to perform tasks such as classification, clustering or prediction (Olden, Lawler, & Poff, 2008). Applications for these tools now span scientific and technological fields as varied as medicine (e.g. Shen, Wu, & Suk, 2017), bioinformatics (e.g. Min, Lee, & Yoon, 2017), finance (e.g. Heaton, Witte, & Polson, 2016) and even video games (e.g. Lample & Chaplot, 2017).

Considering the complexity of ecological data and the ever-growing size of ecological datasets, a phenomenon recently amplified by the widespread use of automatic recorders (Rovero, Zimmermann, Berzi, & Meek, 2013), we believe that deep learning can be a key tool for many ecologists. Indeed, other machine learning approaches such as artificial neural networks (Lek et al., 1996), genetic algorithms (Stockwell & Noble, 1992), support vector machines (Drake, Randin, & Guisan, 2006) or random forests (Cutler et al., 2007) have been successfully used and documented in ecology in the past 20 years (Lek & Guegan, 2012; Olden et al., 2008; Recknagel, 2001). However, to our knowledge, we currently lack a clear overview of when deep learning could be useful to ecologists. This review shows that the flexibility of deep learning can make it beneficial to most ecological disciplines, even in applied contexts, such as management and conservation. We identify common challenges and provide answers and resources to help ecologists decide whether deep learning is an appropriate method of analysis for their studies.


To summarize what deep learning is, we first present its shared roots with machine learning. Machine learning in general refers to a category of algorithms that can automatically generate predictive models by detecting patterns in data. These tools are interesting for ecologists because they can analyse complex nonlinear data, with interactions and missing data, which are frequently encountered in ecology (Olden et al., 2008). Machine learning has already been successfully applied in ecology to perform tasks such as classification (Cutler et al., 2007), ecological modelling (Recknagel, 2001) or studying animal behaviour (Valletta, Torney, Kings, Thornton, & Madden, 2017). What makes deep learning algorithms different, and so powerful, is the way they learn features from data.

First, learning can occur without supervision, where computers automatically discover patterns and similarities in unlabelled data. With this method, no specific output is expected, and it is often used as an exploratory tool to detect features in data, reduce its dimensions, or cluster groups of similar data (Valletta et al., 2017). Second, learning can be done with supervised training. A labelled dataset with the target objects is first given to the computers so they can learn to associate the labels with the examples. They can then recognize and identify these objects in other datasets (LeCun, Bengio, & Hinton, 2015). However, in conventional machine learning, providing only the labels is insufficient: the user also needs to specify in the algorithm what to look for (Olden et al., 2008). For instance, to detect giraffes in images, the algorithm requires specific properties of giraffes (e.g. shape, colour, size, patterning) to be explicitly stated in terms of patterns of pixels. This can hamper non-specialists in machine learning because it usually requires both a deep knowledge of the studied system and good programming skills. In contrast, deep learning methods skip this step. Using general learning procedures, deep learning algorithms are able to automatically detect and extract features from data. This means that we only need to tell a deep learning algorithm whether a giraffe is present in a picture and, given enough examples, it will figure out by itself what a giraffe looks like. Such automated learning is made possible by decomposing the data into multiple layers, each with a different level of abstraction, that allow the algorithm to learn complex features representing the data.

The ability to auto-detect features in complex, high-dimensional data with high predictive accuracy is what led to the fast expansion and ubiquity of deep learning methods (LeCun et al., 2015). Research at numerous levels in ecology (from individual to meta-ecosystem scales) often furnishes exactly the kind of high-dimensional datasets on which deep learning is especially accurate and efficient.

In practice, there are multiple ways to achieve these results, with different deep learning architectures available (Box 1). Among them, the most widely used is the convolutional neural network (CNN), the architecture that helped popularize deep learning due to its performance in image classification (Krizhevsky et al., 2012). However, as numerous implementations have emerged (Chollet, 2016; He, Zhang, Ren, & Sun, 2016; Simonyan & Zisserman, 2014), and because better performance can usually be obtained by adapting the implementation to the problem to solve (Wäldchen & Mäder, 2018), the inner workings of each tool go beyond the scope of this review.

Box 1. Popular deep neural network architectures

From a technical standpoint, deep learning algorithms are multilayered neural networks. Neural networks are models that process information in a way inspired by biological processes, with highly interconnected processing units called neurons working together to solve problems (Olden et al., 2008) (Figure 1). Neural networks have three main parts: (a) an input layer that receives the data, (b) an output layer that gives the result of the model, and (c) the processing core that contains one or more hidden layers. What differentiates a conventional neural network from a deep one is the number of hidden layers, which represents the depth of the network. Unfortunately, there is no consensus on how many hidden layers are required to differentiate a shallow from a deep neural network (Schmidhuber, 2015).

During training, the network adjusts its behaviour in order to obtain the desired output. This is done by computing an error function that compares the output of the model to the correct answer. The network then minimizes this error by adjusting internal parameters called weights, generally using a process called gradient descent (LeCun et al., 2015).
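The weight-adjustment loop described above can be sketched in a few lines. This toy example (all names and values our own, using plain NumPy rather than a deep learning framework) fits a single-layer network to a known linear rule by gradient descent on a squared-error function:

```python
import numpy as np

# Minimal illustration of training by gradient descent: a single-layer
# network learns to map inputs x to targets y by repeatedly nudging its
# weights against the gradient of the error function.
rng = np.random.default_rng(0)
x = rng.normal(size=(100, 3))        # 100 examples, 3 input features
true_w = np.array([1.5, -2.0, 0.5])  # the rule the network must recover
y = x @ true_w                       # targets generated from that rule

w = np.zeros(3)                      # initial weights
learning_rate = 0.1
for _ in range(200):
    pred = x @ w
    error = pred - y
    grad = 2 * x.T @ error / len(y)  # gradient of the mean squared error
    w -= learning_rate * grad        # the gradient-descent update

print(np.round(w, 2))                # weights converge towards true_w
```

Deep networks perform the same update over millions of weights spread across many layers, with the gradients propagated backwards layer by layer.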

Among deep networks, several structures can be found. Feedforward networks map an input of determined size (e.g. an image) to an output of a given size (e.g. a classification probability) by going through a fixed number of layers (LeCun et al., 2015). One of the feedforward implementations that received the most attention due to its ease of training and good generalization is the CNN. CNNs are designed to process multiple arrays of data such as colour images and generally consist of stacking groups of convolutional layers and pooling layers in a way that is inspired by biological visual systems (LeCun et al., 2015).
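As a rough illustration of the two building blocks mentioned above, here is a hand-rolled convolution and pooling pass in NumPy. Real CNNs stack many such layers with learned kernels inside a framework; the `convolve2d` and `max_pool` helpers and the edge-detector kernel below are purely illustrative:

```python
import numpy as np

def convolve2d(image, kernel):
    """Valid 2D cross-correlation: slide the kernel over the image and
    sum the element-wise products, producing one feature map."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(fmap, size=2):
    """Non-overlapping max-pooling: keep the strongest response in each
    size x size block, halving the spatial resolution."""
    h, w = fmap.shape
    trimmed = fmap[:h - h % size, :w - w % size]
    return trimmed.reshape(h // size, size, w // size, size).max(axis=(1, 3))

image = np.arange(36, dtype=float).reshape(6, 6)  # toy 6 x 6 'image'
edge_kernel = np.array([[1.0, -1.0]])             # horizontal gradient detector
fmap = convolve2d(image, edge_kernel)             # feature map, shape (6, 5)
pooled = max_pool(fmap)                           # pooled map, shape (3, 2)
```

In a trained CNN the kernel values are not fixed by hand as here: they are weights learned by gradient descent, so the network discovers by itself which visual features to extract.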

On the other hand, recurrent neural networks (RNN) usually have only one hidden layer, but they process elements in sequence, one at a time, keeping a memory of previous elements: each output is included in the input for the next element. The summation of each individual step can thus be seen as one very deep feedforward network. This makes them particularly interesting for sequential input such as speech or time series (LeCun et al., 2015). A popular implementation of RNN is the Long Short-Term Memory (LSTM) network, an architecture capable of learning long-term dependencies that has proven especially efficient for tasks such as speech recognition (Fernández, Graves, & Schmidhuber, 2007) or translation (Sutskever, Vinyals, & Le, 2014).
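The sequential processing described above can be sketched as a minimal recurrent step (a plain RNN, not an LSTM; the `rnn_forward` helper and all weight names are our own illustrations):

```python
import numpy as np

def rnn_forward(sequence, W_x, W_h, b):
    """Process a sequence one element at a time. Each step's hidden
    state combines the current input with the previous hidden state,
    so unrolling over T steps resembles a T-layer feedforward network."""
    h = np.zeros(W_h.shape[0])        # memory starts empty
    states = []
    for x_t in sequence:
        h = np.tanh(W_x @ x_t + W_h @ h + b)  # the recurrent update
        states.append(h)
    return states

rng = np.random.default_rng(1)
seq = [rng.normal(size=2) for _ in range(5)]  # 5 time steps, 2 features each
W_x = rng.normal(size=(4, 2)) * 0.5           # input-to-hidden weights
W_h = rng.normal(size=(4, 4)) * 0.5           # hidden-to-hidden ('memory') weights
b = np.zeros(4)
states = rnn_forward(seq, W_x, W_h, b)        # one 4-unit state per time step
```

An LSTM replaces the single `tanh` update with gated updates that decide what to store and what to forget, which is what lets it retain information over much longer sequences.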

Figure 1. Architecture of common deep neural networks. (a) Feedforward networks are unidirectional, from the input layer to the output layer and through hidden layers. Deep feedforward networks usually have at least three layers. (b) Simple recurrent neural networks get input from previous time steps and can be unfolded into feedforward networks.


To identify areas where deep learning could be beneficial to ecologists, we performed a review of articles that use deep learning methods for ecological studies or that describe methods that could be used in ecological studies such as animal or plant identification or behavioural detection.

3.1 Review method

On December 13th, 2018, we queried four search engines—Web of Science, Science Direct, arxiv.org and bioRxiv—with the following keywords: (a) ‘deep learning’ AND algorithm; (b) ‘deep neural network’; (c) ‘convolutional neural network’; and (d) ‘recurrent neural network’ (see Box 1 for more information on deep learning implementations). When available, we restricted our search to categories relevant to ecology; otherwise, we added ‘ecology’ to the search terms. To obtain the most up-to-date information, we included preprints, since deep learning is still very recent and the publishing process can be long. We obtained 127 unique papers, 64 of which were deemed irrelevant; 24 more were added after looking through the references, for a total of 87 papers. The dominant implementation was CNNs (n = 64), mostly used for image processing (n = 59) (Figure 2). Other popular uses included sound processing (n = 10) and modelling (n = 11).

Figure 2. Repartition of deep learning implementations in ecology by year and architecture. Implementations were grouped into five categories: convolutional neural networks (CNN), recurrent neural networks (RNN), unsupervised methods, ‘Other’ and ‘N/A’. The ‘Other’ category includes studies where the type of algorithm was either difficult to identify or undisclosed. N/A includes studies that deal with but do not implement deep learning methods. Note that one study (Taghavi Namin, Esmaeilzadeh, Najafi, Brown, & Borevitz, 2018) was counted twice as it implemented a combination of CNN and RNN. The list of selected papers (n = 87) used to produce the figure can be found in supporting information 1.

The vast majority of the selected papers (n = 69, 78%) were published in 2017 or after, showing the recent surge in interest for the method (Figure 2). Four papers available online were already planned for publication in 2019 at the time of the literature search.

Deep learning methods have already obtained good results in a wide range of applications (Figure 3). The next sections provide examples of ecological disciplines that can benefit from such tools.

Figure 3. Overview of deep learning applications in ecology depending on the study scale. Symbols courtesy of the Integration and Application Network, University of Maryland Center for Environmental Science (ian.umces.edu/symbols/). The numbers in brackets refer to references as provided in supporting information 1.

3.1.1 Identification and classification

With the advent of automatic monitoring, ecologists are now able to accumulate a large amount of data in a short amount of time. Data can be gathered from devices such as camera traps, sound recorders, smartphones or even drones (Gray et al., 2019; Knight et al., 2017; Wäldchen & Mäder, 2018). However, extracting relevant information from the large recorded datasets has become a bottleneck, as doing it manually is both tedious and time consuming (Norouzzadeh et al., 2018). Automating the analysis process to identify and classify the data has therefore become necessary, and deep learning methods have proven to be effective solutions. In fact, the LifeCLEF 2018 contest, an event that aims to evaluate the performance of state-of-the-art identification tools for biological data, received only submissions based on deep learning (Joly et al., 2018).

Convolutional neural networks have most commonly been used with images to identify and classify animals (Gomez Villa, Salazar, & Vargas, 2017; Norouzzadeh et al., 2018; Tabak et al., 2019) or plants (Barre, Stoever, Mueller, & Steinhage, 2017; Rzanny, Seeland, Wäldchen, & Mäder, 2017). They can even work with digitized images of herbaria (Younis et al., 2018), an asset for taxonomists. For more information on the subject, applications of deep learning in image-based identification have recently been reviewed by Wäldchen and Mäder (2018).

Deep learning can also be used with acoustic data such as bird songs (Knight et al., 2017; Potamitis, 2015; Salamon, Bello, Farnsworth, & Kelling, 2017), marine mammal vocalizations (Dugan, Clark, LeCun, & Van Parijs, 2016) and even mosquito sounds (Kiskin et al., 2018).

Other applications include phenotyping, that is, classifying the visible characteristics of a species to link them to its genotype, such as counting leaves to assess the growth of a plant (Dobrescu, Giuffrida, & Tsaftaris, 2017) or monitoring the root systems of plants to study their development and their interaction with the soil (Douarre, Schielein, Frindel, Gerth, & Rousseau, 2016). While mainly used in agricultural research so far, these techniques could be translated to ecology, for example to study the productivity of an ecosystem or to measure the impacts of herbivory on plant communities.

3.1.2 Behavioural studies

Deep neural networks can automate the description of animal behaviour, thus proving valuable for ethological studies. For instance, insight into the social behaviour of individuals has been gained by describing their body position and tracking their gaze (Pereira et al., 2019; Qiao et al., 2018; Turesson, Conceicao, & Ribeiro, 2016). Images from camera trapping have been successfully used to describe and classify wild animals’ activities such as feeding or resting (Norouzzadeh et al., 2018). Collective behaviour and social interactions of species such as bees can even be studied using CNNs to locate and identify marked individuals (Wild, Sixt, & Landgraf, 2018), thus opening the way to powerful capture-mark-recapture techniques applied to a wide set of species.

As telemetry datasets are growing bigger every day, deep learning can be used to detect activity patterns such as foraging. By training a CNN with GPS localizations coupled with time-depth recorder data used to detect the diving behaviour of seabirds, a research team has been able to predict diving activities from GPS data alone (Browning et al., 2017).

Models of animal behaviour can also be created. For instance, by analysing videos of the nematode worm Caenorhabditis elegans, an RNN was able to generate realistic simulations of worm behaviours; that model also doubled as a classification tool (Li, Javer, Keaveny, & Brown, 2017). Theoretical simulations of courtship rituals in monogamous species (Wachtmeister & Enquist, 2000) and of the evolution of species recognition in sympatric species (Ryan & Getz, 2000) have also been created.

3.1.3 Population monitoring

As deep learning is used to detect, identify and classify individuals in automatic monitoring data, such tools can be scaled up to help monitor populations. For instance, population size can be estimated by counting individuals (Guirado, Tabik, Rivas, Alcaraz-Segura, & Herrera, 2018; Norouzzadeh et al., 2018). By extension, information such as population distribution or density can also be calculated from these data, as has already been done with traditional methods (Rovero et al., 2013).

Detecting symptoms of disease is another promising application of deep learning, mirroring existing applications in disciplines such as medicine (e.g. Shen et al., 2017). For instance, CNNs have been used to detect tree defoliation or diseases in crops (Kalin, Lang, Hug, Gessler, & Wegner, 2018; Mohanty, Hughes, & Salathé, 2016). This technology could be widely applied to wild plant and animal populations to help find signs of scars, malnutrition or visible diseases.

3.1.4 Ecological modelling

Ecologists often require powerful and accurate predictive models to better understand complex processes or to provide forecasts in a gradually changing world. Machine learning methods have shown great promise in that regard (Olden et al., 2008), and deep learning methods are no exception. A deep neural network has recently been able to accurately create distribution models of species based on their ecological interactions with other species (Chen, Xue, Chen, Fink, & Gomes, 2016). With enough data, these methods could also become the avenue for studying ecological interactions (Desjardins-Proulx, Laigle, Poisot, & Gravel, 2017).

Deep networks have the potential to model the influence of environmental variables on living species even though they have not yet been applied in this way. Studies in the medical field managed to predict gastrointestinal morbidity in humans from pollutants in the environment (Song, Zheng, Xue, Sheng, & Zhao, 2017), a method that could easily be transferable to wild animals. Recurrent networks have also been shown to successfully predict abundance and community dynamics based on environmental variables for phytoplankton (Jeong, Joo, Kim, Ha, & Recknagel, 2001) and benthic communities (Chon, Kwak, Park, Kim, & Kim, 2001). Overall, with such potential for predicting species distributions from environmental factors, deep learning could become part of the toolbox for ecological niche modelling.

3.1.5 Ecosystem management and conservation

With human activities affecting all ecosystems, a major task for ecologists has been to monitor and understand these ecosystems and their changes for management and conservation purposes (Ellis, 2015). We argue here that deep learning tools are appropriate methods to fulfil such aims. For instance, biodiversity at a given site can be estimated via the identification of species sampled in automatic recordings (Salamon et al., 2017; Villon et al., 2018). The timing of species presence at any given site can also be measured with time labels tailored to species life cycles (Norouzzadeh et al., 2018). The functioning and stability of ecosystems can then be monitored by converting all these species data and interactions into food web models and/or by focusing on indicator species such as bats, which are very sensitive to habitat and climate changes (Mac Aodha et al., 2018). Finally, the importance of ecosystem services can be assessed to help decision makers with their policies or management decisions (Lee, Seo, Koellner, & Lautenbach, 2019).

Deep learning is also well suited to landscape analysis for large-scale monitoring. To monitor coral reefs, CNNs have been trained to quantify the percent cover of key benthic substrates from high-resolution images (Beijbom et al., 2015). Events that modify the landscape such as cotton blooms are detectable using convolutional networks and aerial images (Xu et al., 2018). Furthermore, by combining satellite imaging, LIDAR data and a multi-layer neural network, the above-ground carbon density was quantified in order to define areas of high conservation value in forests on the island of Borneo (Asner et al., 2018).

Beyond mapping species and areas of high value for ecosystems and conservation, deep learning has a large set of potential applications to track the impacts of human activities. Recently, deep neural networks mapped the footprint of fisheries using tracking information from industrial fishing vessels (Kroodsma et al., 2018). Also, in order to reduce illegal trafficking, it has been suggested that deep learning algorithms could monitor social media to automatically detect pictures of illegal wildlife products (Di Minin, Fink, Tenkanen, & Hiippala, 2018). Using deep learning for data mining could easily be extended to other areas, as social media mining has proven useful for ecological research such as phenological studies (Hart, Carpenter, Hlustik-Smith, Reed, & Goodenough, 2018).

To go even further, deep learning has already been envisioned as a cornerstone in a fully automated system for managing ecosystems, using automated sensors, drones and robots. Such systems would allow continuous ecosystem management without requiring much human intervention (Cantrell, Martin, & Ellis, 2017).


While deep learning methods are powerful and promising for ecologists, these tools have requirements that need to be considered before deciding to implement them. In this section, we identify common questions that often arise when dabbling in deep learning waters. We also provide guidelines and suggestions to help ecologists decide when deep learning would be beneficial to their studies. However, since this section does not aim to be exhaustive, a good practice is to consult or collaborate with computer scientists before using deep learning, in the same way one would consult a statistician before designing a study.

4.1 Machine learning versus deep learning: which one to choose?

Two of the most common questions encountered are why one should use deep learning instead of ‘traditional’ machine learning, and how it differs. The main difference from other methods lies in the way features are extracted from data. With traditional machine learning algorithms, feature extraction requires human supervision, whereas deep learning tools can learn very complex representations of data by themselves thanks to their multilayered nature. They are therefore easier to use when the user has limited knowledge about the features to detect. The record-breaking accuracy achieved in identification and classification tasks (e.g. Krizhevsky et al., 2012; Joly et al., 2018) also points to one of the main reasons to use deep learning: performance. However, these results depend on the existence of a sizeable labelled dataset that can be used to train the algorithms to extract the desired features from the data. The training process can be more time consuming and require far more computer power than traditional methods. Deep learning is thus especially appropriate when analysing large amounts of data, and it performs particularly well on complex tasks such as image classification or speech/sound recognition.

4.2 How to create a deep learning model?

Here we provide a general summary of the main steps involved in creating a supervised deep learning model (Figure 4). Note that we assume that the user has already made the choice of using deep learning. This section and the following will mostly focus on supervised deep learning as it is the approach that received the most attention at the time of writing. For unsupervised learning, the major difference is that a labelled training dataset is not needed, and training will occur on the raw dataset. Creating a deep learning algorithm requires the following:
  • Select an architecture—the structure and composition of the layers of the network—depending on the type of data to use (see section 2)
  • Select a framework—the set of tools used to implement the architecture—to create the model (see section 4.4)
  • Implement the model. This usually implies either using a pre-existing model offered by the framework or manually coding each layer of the model.
  • Acquire a training dataset (see sections 4.5 and 4.6).
  • Train the model with a training subset that usually represents 70%–80% of the whole training dataset.
  • Test the accuracy of the model by running it on a validation subset (the remaining 20%–30% of the training dataset). As the result of the model is compared to a known value, its performance can be assessed by calculating metrics such as the precision—the proportion of positive predictions that are correct—and the recall—the proportion of actual positives that are correctly identified. Another popular metric is the F1-score, the harmonic mean of precision and recall.
  • Refine the model if needed. This part can easily become quite technical as improving a model can entail actions such as simply changing the initialization parameters, getting more data, completely changing the architecture of the model or customizing each layer. The extent to which a model should be refined heavily depends on its expected accuracy. If improving the performance is really needed, we recommend consulting with a computer scientist as this is a research domain in itself.
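The splitting and evaluation steps above can be sketched in plain Python (the `split_dataset` and `precision_recall_f1` helpers are illustrative, not part of any framework):

```python
import random

def split_dataset(examples, train_fraction=0.8, seed=42):
    """Shuffle and split a labelled dataset into a training subset
    (here 80%) and a validation subset (the remaining 20%)."""
    shuffled = examples[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

def precision_recall_f1(predictions, truths, positive="giraffe"):
    """Compare model outputs to the known labels of the validation subset."""
    pairs = list(zip(predictions, truths))
    tp = sum(p == positive and t == positive for p, t in pairs)
    fp = sum(p == positive and t != positive for p, t in pairs)
    fn = sum(p != positive and t == positive for p, t in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0  # correct among predicted positives
    recall = tp / (tp + fn) if tp + fn else 0.0     # actual positives correctly found
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Toy labelled dataset of 100 camera-trap 'images'
dataset = [("img%03d" % i, "giraffe" if i % 2 else "empty") for i in range(100)]
train, val = split_dataset(dataset)        # 80 training, 20 validation examples

# Toy model outputs scored against known labels
p, r, f1 = precision_recall_f1(
    ["giraffe", "giraffe", "empty", "empty"],
    ["giraffe", "empty", "giraffe", "empty"])
```

With one true positive, one false positive and one false negative, this toy evaluation gives a precision, recall and F1-score of 0.5 each.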
Figure 4. Flowchart of the steps required to create a deep learning network.

In addition, we provide some references and links to useful tutorials and resources in supporting information 2.

4.3 Which architecture should I choose?

This question is difficult to answer, as the choice is very data-dependent and there is no single do-it-all solution. Here we define architecture as the structure of a deep network. Architectures can vary in many ways, such as the number of layers, their composition, their order or the type of functions used. The performance of an architecture is conditioned by the type of data provided and the task at hand. For exploratory analyses, an unsupervised approach is preferable. Most of the emphasis until now has been put on feedforward networks such as CNNs, so these networks are currently the recommended approach for identification and classification tasks. However, RNNs might be more appropriate for analysing sequential data such as time series (Box 1). As deep learning is nearly ubiquitous across research fields, looking for prior studies and publications outside ecology can also help select an appropriate implementation for a specific task.

Even inside a family of architectures, some implementations are more popular than others because of their performance, or because of the innovations they brought. As each model can offer different results depending on the task and the training dataset, finding the best performing model might require testing several of them and comparing their performance (Norouzzadeh et al., 2018). To facilitate this, most deep learning frameworks include out-of-the-box implementations of these architectures and even ready-to-use pretrained models (Table 1).

Table 1. List of popular deep learning frameworks

Framework | Language | ONNX support | Implementation of popular networks | Pre-trained models available | URL
Tensorflow | Python, C/C++, R, Java, Go, Julia | Needs conversion from external tool | Yes | Yes | https://www.tensorflow.org/
PyTorch | Python | Yes | Yes | Yes | https://pytorch.org/
Keras* | Python, R | Needs conversion from external tool | Yes | Yes | https://keras.io/
Microsoft Cognitive Toolkit (CNTK) | C#, C++, Python | Yes | Yes | Yes | https://docs.microsoft.com/en-us/cognitive-toolkit/
Deeplearning4J | Java, Scala | Basic support | Yes | Yes | https://deeplearning4j.org/
MATLAB + Deep Learning Toolbox | MATLAB | Yes | Yes | Yes | https://www.mathworks.com/products/deep-learning.html
Apache MXNET | C++, Python, Julia, Matlab, JavaScript, Go, R, Scala, Perl | Yes | Yes | Yes | http://mxnet.incubator.apache.org/
PlaidML | Python | Yes | Yes (via Keras) | Yes (via Keras) | https://github.com/plaidml/plaidml

* Keras is actually a high-level interface that works on top of other frameworks such as Tensorflow, CNTK or PlaidML.

Note that most CNN architectures are targeted towards image classification tasks. For other tasks, such as sound classification, creating a custom network might be more appropriate (Browning et al., 2017; Fairbrass et al., 2019; Pereira et al., 2019). An alternative approach is to convert the original data into images and treat the problem as an image classification one, as has recently been done with bird song classification (Sevilla, Bessonne, & Glotin, 2017).
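A minimal sketch of this sound-to-image conversion, assuming a plain short-time Fourier transform in NumPy (the `spectrogram` helper and its parameters are our own; published work typically uses more elaborate mel-scaled spectrograms):

```python
import numpy as np

def spectrogram(signal, frame_size=256, hop=128):
    """Turn a 1-D waveform into a 2-D time-frequency array by taking
    the FFT magnitude of successive overlapping windowed frames; the
    result can be treated as a greyscale image for a CNN classifier."""
    window = np.hanning(frame_size)
    frames = [signal[i:i + frame_size] * window
              for i in range(0, len(signal) - frame_size + 1, hop)]
    return np.abs(np.fft.rfft(frames, axis=1)).T  # rows: frequencies, cols: time

# A toy 'song': one second of a 440 Hz tone sampled at 8 kHz
sr = 8000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440 * t)
spec = spectrogram(tone)                 # shape (129 frequency bins, 61 frames)

# The strongest frequency bin should sit near 440 Hz
peak_bin = spec.mean(axis=1).argmax()
peak_hz = peak_bin * sr / 256
```

Once converted, each spectrogram can be fed to any off-the-shelf image CNN, which is what makes this workaround attractive when no sound-specific architecture is at hand.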

4.4 Which framework should I choose?

With the rapid development of deep learning, a great number of libraries and packages, and even whole artificial intelligence ecosystems, have been created to set up deep networks with minimal effort. We list some of the most popular frameworks in Table 1. Most of the popular tools are open source, and packages are available in multiple programming languages; Python currently seems to be the most popular language for deep learning. All these frameworks offer different compromises in terms of ease-of-use, available resources, architecture support, customizability and hardware support, so careful consideration should go into selecting the tools that best suit one's needs. Should the user later decide to change frameworks, new standards such as the Open Neural Network Exchange format (ONNX) (https://onnx.ai) have emerged that allow an architecture created in one framework to be used in another, offering greater interoperability between tools. However, the training process will still have to be repeated, as each framework reads and writes the results of training differently.

4.5 How much data are necessary?

Perhaps the biggest challenge for supervised deep learning lies in the need for a large training dataset to achieve high accuracy. As algorithms are trained by example, they can only detect what they have previously been shown. Training datasets therefore often contain thousands to millions of examples, depending on the task, the number of items to detect and the desired performance (Marcus, 2018). Good results have nonetheless been obtained with smaller training datasets of only a few hundred examples per class (Abrams et al., 2018; Fairbrass et al., 2019; Guirado et al., 2018), opening the approach to most fields of ecology. Yet, overall, the bigger the training dataset, the better the classification accuracy (Marcus, 2018). It is, however, important to make sure that each classification category has enough examples to avoid identification biases (Wearn, Freeman, & Jacoby, 2019).

Also note that the training dataset is usually split into two subsets: one to actually train the model and one to assess its performance. It is therefore essential that each subset contains enough examples representative of all classification categories to allow for efficient training.
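As a minimal illustration of such a split, the sketch below (plain Python, with a hypothetical function name; in practice a library routine such as scikit-learn's train_test_split with its stratify option would typically be used) divides a labelled dataset so that every category is represented in both subsets:

```python
import random

def stratified_split(examples, labels, val_fraction=0.2, seed=42):
    """Split a labelled dataset into training and validation subsets,
    keeping every classification category represented in both."""
    rng = random.Random(seed)
    by_class = {}
    for example, label in zip(examples, labels):
        by_class.setdefault(label, []).append(example)
    train, val = [], []
    for label, items in by_class.items():
        rng.shuffle(items)
        n_val = max(1, int(len(items) * val_fraction))  # at least one per class
        val += [(x, label) for x in items[:n_val]]
        train += [(x, label) for x in items[n_val:]]
    return train, val
```

With, say, 50 examples each of two classes and the default 20% validation fraction, this yields 80 training and 20 validation pairs, with both classes present in each subset.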

This need for data also implies that the dataset to be analysed must itself be of sufficient size, and finding the right size threshold is critical. For instance, in acoustic processing, at least 36 hr of recordings were required for a deep learning algorithm to become more efficient than human listening (Knight et al., 2017).

4.6 What to do if I do not have enough data?

Creating a labelled training dataset from scratch can be a long and tedious task. To help alleviate the need for such large numbers of training examples, multiple solutions have appeared in recent years and are readily available. Here we present some of the most popular.

4.6.1 Public datasets

Public annotated databases can increasingly be found online in order to facilitate the training of deep neural networks in ecology. Some of them include millions of bird sounds, like the Macaulay (https://www.macaulaylibrary.org/) or Xeno-Canto (https://www.xeno-canto.org/) libraries, bat calls (Mac Aodha et al., 2018), plants (Giuffrida, Scharr, & Tsaftaris, 2017) or animal images (Swanson et al., 2015). More generalist reference databases are also available to pretrain neural networks such as MNIST (http://yann.lecun.com/exdb/mnist/) or ImageNet (http://image-net.org/).

As scientists are increasingly required to make their research data available, training datasets will become easier to come by in the near future; and the recent surge in data repositories will facilitate data-hungry analyses. Some journals such as Scientific Data even focus solely on the publication of large research datasets (Candela, Castelli, Manghi, & Tani, 2015).

4.6.2 Crowd sourcing

Manual identification can also be outsourced to others thanks to citizen science. Using platforms such as Zooniverse (https://www.zooniverse.org), it is possible to create projects asking people to help label datasets. While variations in observer quality can be concerning, their impact can be reduced by identifying unreliable participants and entries and then filtering them out by modifying the analysis criteria (Dickinson, Zuckerberg, & Bonter, 2010). This approach has therefore been successfully used in several projects and will probably grow with time (Mac Aodha et al., 2018; Swanson et al., 2015).

4.6.3 Transfer learning

Transfer learning is a method that can help reduce the required size of the training dataset (Schneider, Taylor, & Kremer, 2018). It consists of pretraining a model on a large dataset with characteristics similar to the data to be processed, so that it learns features tailored to that type of data. For instance, a user who wants to detect objects in pictures, but has a limited annotated set, can first train the model on a large public image dataset, even when the images are unrelated. The model can learn to detect generic features like edges or colours (Schneider et al., 2018) and can then be trained on the smaller dataset containing the objects to recognize. To save time, it is even possible to directly download the results of pretraining on large public image datasets for some popular implementations of CNNs (Schneider et al., 2018). While the model still needs to be retrained on examples more closely associated with the question at hand, this can help reduce the size of the dataset needed to achieve good performance.
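The idea can be sketched with a deliberately simplified toy example in plain Python (all names are hypothetical): below, pretrained_features stands in for a frozen network pretrained on a large dataset, and only a small linear 'head' is trained, here with the classic perceptron rule, on the new, smaller dataset:

```python
import random

def pretrained_features(image):
    """Stand-in for a frozen, pretrained feature extractor (in practice,
    e.g. a CNN pretrained on ImageNet). Here it returns crude brightness
    and contrast summaries of a greyscale image (a list of pixel rows)."""
    pixels = [p for row in image for p in row]
    mean = sum(pixels) / len(pixels)
    spread = sum(abs(p - mean) for p in pixels) / len(pixels)
    return [mean, spread, 1.0]  # the constant term acts as a bias

def predict(weights, image):
    """Classify an image as +1 or -1 from its frozen features."""
    feats = pretrained_features(image)
    return 1 if sum(w * f for w, f in zip(weights, feats)) > 0 else -1

def train_head(dataset, epochs=50, lr=0.01, seed=0):
    """Train only the small linear 'head' on top of the frozen features,
    using the perceptron rule; the extractor itself is never updated."""
    rng = random.Random(seed)
    data = list(dataset)
    weights = [0.0, 0.0, 0.0]
    for _ in range(epochs):
        rng.shuffle(data)
        for image, label in data:  # label is +1 or -1
            if predict(weights, image) != label:
                feats = pretrained_features(image)
                weights = [w + lr * label * f for w, f in zip(weights, feats)]
    return weights
```

One bright and one dark example are enough for the head to separate the two classes, while the 'pretrained' extractor stays fixed throughout.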

4.6.4 Data augmentation

Another way to obtain enough data is data augmentation, which consists of artificially generating training data from annotated samples. For instance, with sound recordings, noise can be added or the sound distorted; images can be flipped, rotated or have their colours altered. This not only feeds a greater variety of data to the model but also provides enough examples for efficient training. Deep learning itself can even be used to generate realistic training data with methods such as generative adversarial networks (Goodfellow et al., 2014); this approach has been applied to successfully generate plant images (Giuffrida et al., 2017) and bee markers (Wild et al., 2018).
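A minimal sketch of such transformations in plain Python (hypothetical helper names; in practice dedicated image- and audio-processing libraries would be used):

```python
import random

def flip_horizontal(image):
    """Mirror an image (a list of pixel rows) left to right."""
    return [list(reversed(row)) for row in image]

def add_noise(signal, amplitude=0.05, seed=1):
    """Add small random noise to an audio waveform (a list of samples)."""
    rng = random.Random(seed)
    return [s + rng.uniform(-amplitude, amplitude) for s in signal]

def augment(images):
    """Double an image training set by adding mirrored copies."""
    return images + [flip_horizontal(im) for im in images]
```

Each transformed copy keeps its original label, so the training set grows without any extra annotation effort.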

4.7 How much computer power is needed?

Training on very large datasets can require a lot of computing power, as a deep learning algorithm may need to learn millions of parameters (Chollet, 2016). To achieve that, powerful hardware resources are needed. In fact, the recent explosion in deep learning has been made possible by technological advances in computer hardware, and especially the use of the graphics processing units (GPUs) found in graphics cards (Schmidhuber, 2015). The good news is that training a deep learning algorithm can technically be done on any recent hardware, allowing any ecologist with a reasonably powerful laptop to do it; however, good graphics cards can speed up training by orders of magnitude (Schmidhuber, 2015). Even then, training a model for very complex analyses can take several days to converge, and fine-tuning to improve accuracy might require several training sessions (Chollet, 2016; Knight et al., 2017). Nevertheless, once the desired accuracy is achieved, the converged model can be saved and reused repeatedly. As inference computations are generally fast and can go through large datasets efficiently compared with alternative approaches, time savings can thus be gained in the long term (Knight et al., 2017).
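As a minimal illustration of this save-and-reuse workflow (a hypothetical toy 'model' serialized with Python's standard pickle module; deep learning frameworks ship their own save and load utilities):

```python
import io
import pickle

# Hypothetical converged model: any object holding the learned parameters.
trained_model = {"weights": [0.42, -1.3, 0.07], "classes": ["fox", "hare"]}

buffer = io.BytesIO()  # stands in for a file on disk
pickle.dump(trained_model, buffer)  # save once, after training converges

buffer.seek(0)
reloaded = pickle.load(buffer)  # reload any number of times for inference
```

The expensive training step thus happens once, and only the cheap inference step is repeated on each new batch of data.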

At the time of writing, most frameworks offer GPU acceleration optimized for cards from only one manufacturer, Nvidia, which dominates the market. However, a lot of effort is being made to support different types of hardware and graphics cards, with compilers such as plaidML (Table 1) designed to improve performance on any hardware and operating system.

4.8 Can I reuse any model for my problem?

A common problem with deep learning is that it has limited potential for solving a task it was not designed and trained for (Marcus, 2018). For instance if we design an acoustic recognizer to identify a particular species from its calls, it might have a hard time recognizing the calls of taxonomically distant species. Therefore, caution should be exercised before considering reusing an existing deep learning model. At the moment, the easiest way to solve this would be to increase the training dataset size to include samples of other species of interest.


5 Conclusions

Deep learning, just like other machine learning algorithms, provides useful methods to analyse nonlinear data with complex interactions and can therefore be useful for ecological studies. Where deep learning algorithms truly shine is in their ability to automatically detect objects of interest in data, such as identifying animals in photographic images, just by showing them examples of what to look for. Moreover, they can do so with great accuracy, making them tools of choice for identification and classification tasks. While the emphasis has so far been on supervised methods due to their performance and ease of training, future developments in unsupervised learning are expected, thus potentially removing the need for annotated datasets altogether (LeCun et al., 2015).

Deep learning shows a lot of promise for ecologists. Although the methods are still very new, implementations already cover a wide array of ecological questions (Figure 3) and can prove to be very useful tools for managers, conservationists or decision makers by providing a fast, objective and reliable way to analyse huge amounts of monitoring data. Applications can also go beyond ecology: deep learning could be valuable in the field of evolution and in biology in general. However, developing a deep learning solution is not yet a trivial task, and ecologists do need to take time to evaluate whether this is the right tool for the job. Requirements in terms of training datasets, training time, development complexity and computing power are all aspects that should be considered before going down the deep learning path.

As ecology enters the realm of big data, the reliance on artificial intelligence to analyse data will become more and more common. Ecologists will then have to acquire or have access to good programming and/or mathematical skills and tools. While this might seem scary at first, we believe that there is one simple solution to this challenge: collaboration across disciplines (Carey et al., 2019). A stronger interaction between computer scientists and ecologists could also lead to new synergies and approaches in data classification and analyses, providing new insights for fundamental and applied research in ecology. This in turn would allow ecologists to focus on the ecological questions rather than on the technical aspects of data analysis, and computer scientists to pave new roads on some of the biological world's most complex units, such as ecosystems. As many others have before, we also strongly encourage sharing datasets and code whenever possible to make ecological research faster, easier and directly replicable in the future, especially when using complex tools such as deep learning (Lowndes et al., 2017; Wilson et al., 2017). With software getting more powerful and easier to use, experience being accumulated and shared resources such as datasets made available to everyone, we believe that deep learning could become an accessible and powerful reference tool for ecologists.


Acknowledgements

The study was funded by the Canada Research Chair in Polar and Boreal Ecology, the New Brunswick Innovation Fund and Polar Knowledge Canada. The authors declare no conflict of interests. We thank Tommy O'Neill Sanger for proof-reading our manuscript.


Authors' contributions

S.C. and N.L. had the original idea for the study and designed the research. S.C., N.L. and E.H. collected the review information and carried out the analyses. S.C. and N.L. wrote the first drafts of the manuscript with input from E.H. All authors discussed the results, implications and edited the manuscript.


Data availability statement

The whole list of surveyed papers can be found at https://figshare.com/s/9810c182268244c5d4b2 (Christin, Hervet, & Lecomte, 2019).