Volume 10, Issue 5
APPLICATION
Open Access

bmotif: A package for motif analyses of bipartite networks

Benno I. Simmons

Corresponding Author

E-mail address: benno.simmons@gmail.com

Conservation Science Group, Department of Zoology, University of Cambridge, Cambridge, UK

Correspondence

Benno I. Simmons

Email: benno.simmons@gmail.com

and

Riccardo Di Clemente

Email: r.diclemente@ucl.ac.uk

Michelle J. M. Sweering

Conservation Science Group, Department of Zoology, University of Cambridge, Cambridge, UK

Faculty of Mathematics, Cambridge, UK

Maybritt Schillinger

Conservation Science Group, Department of Zoology, University of Cambridge, Cambridge, UK

Faculty of Mathematics, Cambridge, UK

Lynn V. Dicks

School of Biological Sciences, University of East Anglia, Norwich, UK

William J. Sutherland

Conservation Science Group, Department of Zoology, University of Cambridge, Cambridge, UK

Riccardo Di Clemente

Corresponding Author

E-mail address: r.diclemente@ucl.ac.uk

Department of Civil and Environmental Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts

Centre for Advanced Spatial Analysis (CASA), University College London, London, UK

First published: 12 January 2019

Abstract

  1. Bipartite networks are widely used to represent a diverse range of species interactions, such as pollination, herbivory, parasitism and seed dispersal. The structure of these networks is usually characterised by calculating one or more indices that capture different aspects of network architecture. While these indices capture useful properties of networks, they are relatively insensitive to changes in network structure. Consequently, variation in ecologically‐important interactions can be missed. Network motifs are a way to characterise network structure that is substantially more sensitive to changes in pairwise interactions and is gaining in popularity. However, there is no software available in R, the most popular programming language among ecologists, for conducting motif analyses in bipartite networks. Similarly, no mathematical formalisation of bipartite motifs has been developed.
  2. Here we introduce bmotif: a package for motif analyses of bipartite networks. Our code is primarily an R package, but we also provide MATLAB and Python code of the core functionality. The software is based on a mathematical framework where, for the first time, we derive formal expressions for motif frequencies and the frequencies with which species occur in different positions within motifs. This framework means that analyses with bmotif are fast, making motif methods compatible with the permutational approaches often used in network studies, such as null model analyses.
  3. We describe the package and demonstrate how it can be used to conduct ecological analyses, using two examples of plant–pollinator networks. We first use motifs to examine the assembly and disassembly of an Arctic plant–pollinator community and then use them to compare the roles of native and introduced plant species in an unrestored site in Mauritius.
  4. bmotif will enable motif analyses of a wide range of bipartite ecological networks, allowing future research to characterise these complex networks without discarding important meso‐scale structural detail.

1 INTRODUCTION

Bipartite networks have long been used to analyse complex systems (Diestel, 2000; Guillaume & Latapy, 2004; Newman, 2010). In ecology, they are widely used to study the structure of interactions between two groups of species, including plants and pollinators, hosts and parasitoids and plants and seed dispersers. Studies of bipartite networks have yielded many new insights. For example, they have been used to uncover widespread nestedness in mutualistic communities (Bascompte, Jordano, Melián, & Olesen, 2003) and to show that community structure is stable despite turnover in species and interactions (Dáttilo, Guimarães, & Izzo, 2013). Such studies typically describe networks with one or more indices, such as connectance (the proportion of possible interactions which are realised), nestedness (the extent to which specialist species interact with subsets of the species generalist species interact with), degree (number of partners a species has) and d′ (the extent to which a species’ interactions deviate from a random sampling of its partners).

More recently, ecologists have been using bipartite motifs to characterise network structure. Bipartite motifs are subnetworks representing interactions between a given number of species (Figure 1). These subnetworks can be considered the basic “building blocks” of networks (Milo et al., 2002). Bipartite motifs are used in two main ways. First, to calculate how frequently different motifs occur in a network; Rodríguez‐Rodríguez, Jordano, and Valido (2017) used this approach to analyse the reproductive consequences of both mutualistic and antagonistic interactions with animals. Second, to quantify species roles in a community by counting the frequency with which species occur in different positions within motifs; for example, Baker, Kaartinen, Roslin, and Stouffer (2015) used this method to demonstrate that species’ roles in host‐parasitoid networks are an intrinsic property of species. Moreover, studies of bipartite motifs in non‐biological networks have been valuable to understand similarities in trade patterns (Saracco, Di Clemente, Gabrielli, & Squartini, 2015), gauge the effects of the 2007 financial crisis on the world trade web (Saracco, Di Clemente, Gabrielli, & Squartini, 2016) and assess the similarity of stock market portfolios (Gualdi, Cimini, Primicerio, Di Clemente, & Challet, 2016).

Figure 1. All bipartite motifs containing up to six nodes (species). Large numbers identify each motif. Small numbers represent the unique positions species can occupy within motifs, following Appendix 1 in Baker et al. (2015). Lines between small numbers indicate undirected species interactions. There are 44 motifs containing 148 unique positions.

The advantage of motifs is that they are significantly more sensitive to changes in network structure than the indices traditionally used to describe bipartite ecological networks. In other words, a wide diversity of network configurations can have similar values of indices such as nestedness, but far fewer network configurations have similar motif compositions. A recent analysis found that, on average, motifs capture 63% more information about network structure than even multivariate combinations of popular network‐level indices and an average of 528% more information than multivariate combinations of species‐level indices; this latter value rises to 1,076% more information in the most extreme case (Simmons, Cirtwill, et al., 2018). Thus, while indices are useful, they also have important limitations. As a simple example, the degree of a plant might show it is visited by two pollinators, while motifs could reveal that one of these pollinators is a generalist visiting three other generalist plants, while the other is a specialist visiting only the focal plant. Such distinctions can have important consequences for understanding the ecology and evolution of communities and so are essential to incorporate in network analyses. However, while the motif framework is gaining in popularity, no software currently exists to conduct motif analyses of bipartite networks in R, the most popular programming language among ecologists.

To fill this gap, we introduce bmotif: an R package, based on a formal mathematical framework, for counting motifs and species positions within motifs in bipartite networks. While bmotif is primarily an R package, we additionally provide MATLAB and Python code that replicates the core package functionality. Here, we introduce the motifs and motif positions counted by bmotif and describe the package's main functions and performance. We then provide two examples showing how bmotif can be used to answer questions about ecological communities. We note that our methods are general, so they can be applied to many types of interaction, such as mutualism, parasitism and herbivory, and even to non‐biological systems, such as trade networks, finance networks and recommendation systems.

2 DESCRIPTION

2.1 Defining bipartite motifs

In a bipartite network containing N species, a motif is a subnetwork comprising n species and their interactions (where n ≤ N and all species have at least one interaction). Figure 1 shows the motifs included in bmotif: all 44 possible motifs containing up to six nodes. Large numbers represent the identity of each motif. Within motifs, species can appear in different positions. Nodes in a motif share the same position if there exists a permutation of these nodes, together with their links, that preserves the motif structure (see Appendix S1 for formal definition) (Kashtan, Itzkovitz, Milo, & Alon, 2004). For example, in motif 9, the left and centre nodes in the top level can be swapped without changing the motif structure, but the centre and right nodes cannot (Figure 1). The 148 unique positions a species can occupy across all motifs up to six nodes are shown in Figure 1 as small numbers associated with each node. These positions are important because each represents a different ecological situation with a unique set of direct and indirect interactions. For example, in motif 3 both species in the top level are in the same position (position 6), indicating that they have identical topological roles: both have a single interaction with the shared resource in position 5. Conversely, in motif 5, the two top‐level species are in different positions (12 and 11), which can have important functional consequences. While the species in position 11 is a specialist on the resource in position 10, the species in position 12 has a wider diet breadth, interacting with the species in positions 9 and 10 and thus having greater redundancy in its partners. Motifs and positions are ordered as in Baker et al. (2015, Appendix 1).

Networks in bmotif are represented as biadjacency matrices (M), with one row for each species in the first set (such as pollinators) and one column for each species in the second set (such as plants). When species i and j interact, m_ij > 0; if they do not interact, m_ij = 0. This widely used representation was chosen for compatibility with other packages and open‐access network repositories, such as the Web of Life (www.web-of-life.es). Species in rows correspond to nodes in the top level of the motifs in Figure 1; species in columns correspond to nodes in the bottom level. Appendix S2 shows how each motif is represented in a biadjacency matrix.
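
As an illustration of this representation, the following Python sketch builds a biadjacency matrix from a list of interaction records, converts it to presence/absence form and computes the connectance and degree indices mentioned in the Introduction. The species names, records and function names are hypothetical and are not part of bmotif.

```python
# Hypothetical interaction records: (row species, column species, visit count).
records = [
    ("bee", "daisy", 4),
    ("bee", "clover", 1),
    ("fly", "daisy", 2),
    ("moth", "clover", 3),
    ("moth", "thistle", 1),
]


def biadjacency(records):
    """Return (row_names, col_names, M), where M[i][j] is the summed weight."""
    rows = sorted({r for r, _, _ in records})
    cols = sorted({c for _, c, _ in records})
    r_index = {name: i for i, name in enumerate(rows)}
    c_index = {name: j for j, name in enumerate(cols)}
    M = [[0] * len(cols) for _ in rows]
    for r, c, w in records:
        M[r_index[r]][c_index[c]] += w
    return rows, cols, M


def binarise(M):
    """Presence/absence version of M: 1 where m_ij > 0, otherwise 0."""
    return [[1 if m > 0 else 0 for m in row] for row in M]


def connectance(B):
    """Proportion of possible interactions that are realised."""
    links = sum(sum(row) for row in B)
    return links / (len(B) * len(B[0]))


def degrees(B):
    """Number of partners of each row species and of each column species."""
    row_deg = [sum(row) for row in B]
    col_deg = [sum(col) for col in zip(*B)]
    return row_deg, col_deg


rows, cols, M = biadjacency(records)
B = binarise(M)
print(rows, cols)
print(B)
print(connectance(B), degrees(B))
```

The same connectance calculation applied to the Mauritius network of Section 3.2 (48 species, of which 20 are plants, and 75 interactions) gives 75 / (20 × 28) ≈ 0.134, which matches the value reported there.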

2.2 Main functions

bmotif has two functions: (a) mcount, for calculating how frequently different motifs occur in a network, and (b) node_positions, for calculating the frequency with which species (nodes) occur in different positions within motifs, which quantifies a species’ structural role. To enumerate motif frequencies and species position counts, bmotif uses mathematical operations directly on the biadjacency matrix: for the first time, we derive 44 expressions, one for each of the 44 motifs, and 148 expressions, one for each of the 148 positions within motifs (Appendix S3). The advantage of this approach is that analyses with bmotif are fast: in a dataset of 175 empirical pollination and seed dispersal networks, for a network with 78 species (close to the mean network size of 77.1 species) and motifs up to six nodes, mcount completed in 0.01 s and node_positions completed in 0.32 s. Appendix S4 gives full details and analyses of bmotif's computational performance, while Appendix S5 provides a detailed description of the outputs returned by the two functions.
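
To give a flavour of what counting directly from the biadjacency matrix can look like, the Python sketch below handles only motifs 1 to 3 and positions 1 to 6, using standard degree-based identities. These are not claimed to be the expressions of Appendix S3, and the function names are hypothetical rather than bmotif API. The numbering assumes that motif 2 is one top-level (row) species linked to two bottom-level (column) species, with positions 1 and 3 in the bottom level and positions 2 and 4 in the top level; motif 3 and positions 5 and 6 follow the description in Section 2.1.

```python
from math import comb


def small_motif_counts(B):
    """Frequencies of motifs 1-3 in a binary biadjacency matrix B."""
    row_deg = [sum(row) for row in B]
    col_deg = [sum(col) for col in zip(*B)]
    return {
        1: sum(row_deg),                      # a single link
        2: sum(comb(d, 2) for d in row_deg),  # one row species, two column species
        3: sum(comb(d, 2) for d in col_deg),  # two row species, one column species
    }


def small_position_counts(B):
    """Counts for positions 2, 4, 6 (row species) and 1, 3, 5 (column species)."""
    row_deg = [sum(row) for row in B]
    col_deg = [sum(col) for col in zip(*B)]
    row_positions = []
    for i, row in enumerate(B):
        partners = [j for j, b in enumerate(row) if b]
        row_positions.append({
            2: row_deg[i],
            4: comb(row_deg[i], 2),
            # Each other row species sharing a partner gives one motif 3.
            6: sum(col_deg[j] - 1 for j in partners),
        })
    col_positions = []
    for j, col in enumerate(zip(*B)):
        partners = [i for i, b in enumerate(col) if b]
        col_positions.append({
            1: col_deg[j],
            # Each other column species sharing a partner gives one motif 2.
            3: sum(row_deg[i] - 1 for i in partners),
            5: comb(col_deg[j], 2),
        })
    return row_positions, col_positions


# Toy network: 3 row species, 3 column species, 5 links.
B = [
    [1, 1, 0],
    [0, 1, 0],
    [1, 0, 1],
]
motifs = small_motif_counts(B)
row_positions, col_positions = small_position_counts(B)
print(motifs)
print(row_positions)
print(col_positions)
```

A useful sanity check on any such formulae is that position counts sum back to motif counts: every occurrence of motif 3 contains two species in position 6 and one in position 5, so the position 6 counts sum to twice the motif 3 frequency.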

3 EXAMPLE ANALYSES

3.1 Comparing community structures

Here we use bmotif to examine the assembly and disassembly of an Arctic plant–pollinator community. Networks were sampled daily, when weather conditions allowed, at the Zackenberg Research Station in northeastern Greenland, across two full seasons in 1996 (24 days) and 1997 (26 days) (Olesen, Bascompte, Elberling, & Jordano, 2008). While these networks use the frequency of animal visits to plants as a surrogate for true pollination, this has been shown to be a reasonable proxy in mutualistic networks (Simmons, Sutherland, et al., 2018; Vázquez, Morris, & Jordano, 2005). Data were obtained from Saavedra, Rohr, Olesen, and Bascompte (2016). We used mcount to calculate motif frequencies in each daily network in both years, normalised using “normalise_nodesets”, which expresses the frequency of each motif as the number of sets of species that form the motif as a proportion of the number of sets of species that could form that motif (see Appendix S5; Poisot & Stouffer, 2016). Days 1 and 24 in 1996 and days 1 and 26 in 1997 were excluded from the analysis because these networks were too small for some motifs to occur. Table 1 shows the data frame returned by mcount for an example daily network (day 12 in 1996) and Figure 2b visualises the distribution of motifs in this network. Using nonmetric multidimensional scaling (NMDS), we visualised how the community structure changed from assembly after the last snow melt to disassembly at the first snow fall, in two consecutive years (Figure 2a). NMDS is an ordination technique that attempts to represent the pairwise dissimilarities between multidimensional data in a lower‐dimensional space as accurately as possible (Kruskal, 1964). NMDS can be used with any dissimilarity measure and is regarded as one of the most robust ordination techniques in ecology (Minchin, 1987). NMDS analyses were conducted with the metaMDS function in the R package vegan using Bray–Curtis dissimilarities (Oksanen et al., 2016).
We used Bray–Curtis dissimilarity as it is a robust dissimilarity measure for a wide range of community traits, including motifs (Baker et al., 2015; Simmons, Cirtwill, et al., 2018). More positive values of the first NMDS axis are associated with motifs where generalist pollinators compete for generalist plants, while negative values are associated with motifs where more specialist pollinators have greater complementarity in the specialist plants they visit. More positive values of the second NMDS axis are associated with loosely connected motifs containing specialist plants interacting with both specialist and generalist pollinators, while negative values are associated with highly‐connected motifs containing pollinators competing for generalist plants. While the community was relatively stable over time in the 1996 season, there were larger structural changes in 1997, with a largely monotonic shift from high competition between generalist pollinators at the start of the season, to lower competition between more specialist pollinators at the end of the season, with a more complementary division of plant resources (Figure 2a). Thus while network structure may appear stable when analysed with traditional indices such as connectance (Olesen et al., 2008), motifs reveal the presence of complex, ecologically‐important structural dynamics. Additionally, it is clear that, even in consecutive years, the community followed different structural trajectories, emphasising the danger of treating networks as static entities.
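
The dissimilarity underlying the ordination can be sketched in a few lines of Python. The example below computes the Bray–Curtis dissimilarity, sum(|x_k − y_k|) / sum(x_k + y_k), between normalised motif profiles; the three profiles are hypothetical values chosen for illustration, not data from the Zackenberg networks, and the NMDS step itself (performed with metaMDS in the analysis above) is not reproduced.

```python
def bray_curtis(x, y):
    """Bray-Curtis dissimilarity between two non-negative profiles."""
    if len(x) != len(y):
        raise ValueError("profiles must have the same length")
    total = sum(x) + sum(y)
    if total == 0:
        raise ValueError("dissimilarity is undefined for two all-zero profiles")
    return sum(abs(a - b) for a, b in zip(x, y)) / total


def dissimilarity_matrix(profiles):
    """Symmetric matrix of pairwise Bray-Curtis dissimilarities."""
    n = len(profiles)
    return [[bray_curtis(profiles[i], profiles[j]) for j in range(n)]
            for i in range(n)]


# Hypothetical normalised motif frequencies for three daily networks.
day_a = [0.5, 0.3, 0.2]
day_b = [0.2, 0.3, 0.5]
day_c = [0.5, 0.3, 0.2]

D = dissimilarity_matrix([day_a, day_b, day_c])
for row in D:
    print([round(d, 3) for d in row])
```

A matrix of this kind, with one row and column per daily network, is the input that an ordination method such as NMDS then tries to represent in two dimensions.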

Table 1. The data frame returned by mcount for an example daily network from Zackenberg Research Station in northeastern Greenland (day 12 in 1996). Details of the different columns are given in Appendix S5
Motif  Nodes  Frequency  normalise_sum  normalise_sizeclass  normalise_nodesets
1      2      140        0.00200194     1                    0.34313725
2      3      621        0.00888005     0.57393715           0.13235294
3      3      461        0.00659212     0.42606285           0.14123775
4      4      1,153      0.01648744     0.1370661            0.07064951
5      4      4,486      0.06414803     0.53328578           0.11951194
6      4      831        0.01188297     0.09878745           0.02213875
7      4      1,942      0.02776983     0.23086068           0.05644036
8      5      2,393      0.03421896     0.03968623           0.04189426
9      5      10,689     0.15284848     0.17726956           0.05695332
10     5      5,243      0.07497283     0.08695147           0.02793585
11     5      5,941      0.08495396     0.09852731           0.03165494
12     5      901        0.01288394     0.01494245           0.00480072
13     5      12,815     0.18324944     0.21252778           0.04655531
14     5      8,564      0.12246182     0.14202793           0.03111195
15     5      8,002      0.11442544     0.13270755           0.02907027
16     5      1,096      0.01567237     0.01817639           0.00398163
17     5      4,654      0.06655036     0.07718332           0.02576367
Figure 2. (a) Nonmetric multidimensional scaling (NMDS) plot showing change in Arctic plant–pollinator network structure over the 1996 and 1997 seasons, quantified using motifs. Numbers represent the days of sampling. (b) The normalised frequency of motifs in one time slice network (day 12 in 1996).
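
Two of the normalised columns in Table 1 can be recovered from the Frequency column alone. The Python sketch below assumes that normalise_sum divides each frequency by the total over all motifs in the table and that normalise_sizeclass divides it by the total over motifs with the same number of nodes. The formal definitions are in Appendix S5, so this is an interpretation, but it is one that reproduces the tabulated values.

```python
from collections import defaultdict

# (motif, nodes, frequency) copied from Table 1.
table1 = [
    (1, 2, 140), (2, 3, 621), (3, 3, 461),
    (4, 4, 1153), (5, 4, 4486), (6, 4, 831), (7, 4, 1942),
    (8, 5, 2393), (9, 5, 10689), (10, 5, 5243), (11, 5, 5941),
    (12, 5, 901), (13, 5, 12815), (14, 5, 8564), (15, 5, 8002),
    (16, 5, 1096), (17, 5, 4654),
]

# Total over all motifs, and totals within each size class.
total = sum(freq for _, _, freq in table1)
class_totals = defaultdict(int)
for _, nodes, freq in table1:
    class_totals[nodes] += freq

normalise_sum = {motif: freq / total for motif, _, freq in table1}
normalise_sizeclass = {
    motif: freq / class_totals[nodes] for motif, nodes, freq in table1
}

for motif, nodes, freq in table1:
    print(motif, nodes, freq,
          round(normalise_sum[motif], 8),
          round(normalise_sizeclass[motif], 8))
```

The third normalisation, normalise_nodesets, additionally needs the number of species in each level of the network, which is not listed in Table 1, so it is not reproduced here.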

3.2 Comparing species’ structural roles

We used node_positions to compare the roles of native and introduced plant species in a plant–pollinator community sampled in Mauritius in November 2003 (Kaiser‐Bunbury, Memmott, & Müller, 2009; 48 species, 75 interactions, connectance = 0.134). Network data were obtained from the Web of Life dataset (www.web-of-life.es) and information on plant origin was obtained from Kaiser‐Bunbury et al. (2009, Appendix II). We calculated the sum‐normalised roles of all plant species (16 native and four introduced; see Table 2 for the data frame returned by node_positions and Figure 3b for the motif composition of the network) and plotted them on two NMDS axes (Figure 3a). This figure shows three striking features. First, there is almost no overlap between native and introduced species’ interaction niches. Similar to research showing that non‐native species can occupy different functional niches to native species (Ordonez, Wright, & Olff, 2010), these results suggest they may also occupy unexploited interaction niches. This aligns with previous studies showing differences in species‐level network indices between native and invasive plant species, such as higher generalisation (Albrecht, Padrón, Bartomeus, & Traveset, 2014) and species strength (Maruyama et al., 2016). Further research could use motifs to investigate whether introduced species “pushed” native species out of previously occupied interaction niche space or whether introduced species colonised previously unused space. If the latter is true, the size of a community's unused “role space” could potentially inform predictions of its vulnerability to invasion. Second, the interaction niche of introduced species is much smaller than that of native species: the four introduced species all occupy similar areas of motif space, possibly suggesting a single “invader role”. 
This could have important implications for predicting the effects of invasive species on community structure, an important challenge especially in the face of global changes. While previous studies have identified species and community traits that predict the identity of invasive species or communities vulnerable to invasion, it has recently been argued that species topological roles are a more practical predictor of how species could affect communities because they are comparatively easier to sample (Emer, Memmott, Vaughan, Montoya, & Tylianakis, 2016). Thus, our finding could lay the foundation for future work predicting which species will become invasive based on their motif roles alone, especially given evidence that species roles are conserved across native and alien ranges (Emer et al., 2016). Third, introduced species occupy lower values on the second NMDS axis, corresponding to motif positions where they are visited by generalist pollinator species, possibly due to the absence of co‐evolutionary associations with specialists.
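
To make the idea of a sum‐normalised role concrete, the Python sketch below assumes that sum normalisation divides each species' position counts by that species' total count across all positions (the definition used by the package is given in Appendix S5). The two plants and their counts are hypothetical and, for brevity, only three positions are included. Both plants have degree 2, but plant A is visited by one generalist (with three other partners) and one specialist, whereas both of plant B's visitors have exactly one other partner.

```python
def sum_normalise(position_counts):
    """Divide each species' position counts by its total across positions."""
    normalised = []
    for counts in position_counts:
        total = sum(counts)
        normalised.append([c / total if total else 0.0 for c in counts])
    return normalised


# Hypothetical raw counts for positions 2, 4 and 6 (in that order).
raw_roles = {
    "plant_A": [2, 1, 3],
    "plant_B": [2, 1, 2],
}

names = list(raw_roles)
roles = dict(zip(names, sum_normalise([raw_roles[n] for n in names])))
for name in names:
    print(name, [round(v, 3) for v in roles[name]])
```

Although the two plants are indistinguishable by degree, their normalised role vectors differ, which is the kind of distinction the ordination in Figure 3a builds on.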

Table 2. The data frame returned by node_positions for the Mauritius plant–pollinator network. Details of this output are given in Appendix S5. For visualisation purposes, only columns 1–6 and 46 are shown
Species                     np1       np2       np3       np4       np5       np6       np46
Sideroxylon puberulum       0.000000  0.003380  0.000000  0.010140  0.000000  0.011589  0.016900
Grangeria borbonica         0.000000  0.002259  0.000000  0.007905  0.000000  0.008752  0.019763
Badula platiphylla          0.000000  0.002629  0.000000  0.005258  0.000000  0.009989  0.002629
Helichrysum proteoides      0.000000  0.001903  0.000000  0.011415  0.000000  0.005854  0.104639
Myonima violacea            0.000000  0.002358  0.000000  0.001179  0.000000  0.014151  0.000000
Harungana madagascariensis  0.000000  0.002494  0.000000  0.002494  0.000000  0.012469  0.000000
Stillingia lineata          0.000000  0.001832  0.000000  0.000916  0.000000  0.010989  0.000000
Ochna mauritiana            0.000000  0.001793  0.000000  0.002689  0.000000  0.012550  0.000448
Olea lancea                 0.000000  0.001768  0.000000  0.000884  0.000000  0.011494  0.000000
Psiadia terebinthina        0.000000  0.002208  0.000000  0.007728  0.000000  0.008832  0.019321
Aphloia theiformis          0.000000  0.001570  0.000000  0.000000  0.000000  0.014129  0.000000
Psidium cattleianum         0.000000  0.002469  0.000000  0.002469  0.000000  0.009877  0.000000
Coffea macrocarpa           0.000000  0.002847  0.000000  0.004270  0.000000  0.012100  0.000712
Homalanthus populifolius    0.000000  0.001832  0.000000  0.000916  0.000000  0.010989  0.000000
Faujasiopsis flexuosa       0.000000  0.001605  0.000000  0.001605  0.000000  0.012841  0.000000
Gaertnera sp1               0.000000  0.002956  0.000000  0.004435  0.000000  0.013304  0.000739
Coffea mauritiana           0.000000  0.011236  0.000000  0.000000  0.000000  0.022472  0.000000
Gaertnera rotundifolia      0.000000  0.004975  0.000000  0.000000  0.000000  0.014925  0.000000
Warneckea trinervis         0.000000  0.001570  0.000000  0.000000  0.000000  0.014129  0.000000
Wikstroemia indica          0.000000  0.001020  0.000000  0.000000  0.000000  0.012245  0.000000
Figure 3. (a) The roles of native and introduced species in a plant–pollinator network. Each point represents the role of a species in the network. Shaded polygons are convex hulls containing either all native species or all introduced species. (b) The normalised frequency of motifs in the network.

4 IMPLEMENTATION AND AVAILABILITY

The bmotif package is available for the R programming language and can be installed in R using install.packages("bmotif"). This paper describes version 1.0.0 of the software. The package is in active development and version 2.0.0, which adds support for weighted networks, will be released soon. The source code of the package is available at https://github.com/SimmonsBI/bmotif. Any problems can be reported using the Issues system of that repository. The code is version controlled, uses continuous integration and has code coverage of approximately 98%. MATLAB and Python code replicating the core package functionality is available at https://github.com/SimmonsBI/bmotif-matlab and https://github.com/SimmonsBI/bmotif-python respectively. All code is released under the MIT license.

5 CONCLUSIONS

bmotif is an R package and set of mathematical formulae enabling motif analyses of bipartite networks. Specifically, bmotif provides functions for two key analyses: (a) enumerating the frequency of different motifs in a network and (b) calculating how often species occur in each position within motifs. These two techniques capture important information about network structure that may be missed by traditional methods. As an illustration, by analysing the roles of native and introduced plant species in a plant–pollinator network, we found that introduced species adopted similar roles in the community that differed from those of native species. Motif approaches represent a new addition to the network ecologist's “toolbox” for use alongside other techniques to analyse bipartite networks. We hope bmotif encourages further uptake of the motif approach to shed light on the ecology and evolution of ecological communities.

ACKNOWLEDGEMENTS

B.I.S. is supported by the Natural Environment Research Council as part of the Cambridge Earth System Science NERC DTP [NE/L002507/1]. B.I.S., M.J.M.S. and M.S. acknowledge the Cambridge Faculty of Mathematics Bridgwater Summer Research Fund/CMP bursary fund for support. L.V.D. is funded by the Natural Environment Research Council (grant code: NE/N014472/1). W.J.S. is funded by Arcadia. R.D.C. as Newton International Fellow of the Royal Society acknowledges support from The Royal Society, The British Academy and the Academy of Medical Sciences (Newton International Fellowship, NF170505).

    AUTHORS’ CONTRIBUTIONS

    B.I.S. conceived the project, conducted analyses and wrote the first manuscript draft. B.I.S., M.J.M.S., M.S. and R.D.C. developed the package. B.I.S., W.J.S., L.V.D. and R.D.C. planned the study. B.I.S. and R.D.C. coordinated and designed the work. All authors contributed to writing.

    DATA ACCESSIBILITY

    All networks are available from the Web of Life repository (www.web-of-life.es), with the exception of the Greenland plant–pollinator networks, which are available from Data Dryad https://doi.org/10.5061/dryad.3pk73 (Saavedra et al., 2016). To obtain the Web of Life networks go to www.web-of-life.es, click “Pollination”, then click “Download”; next, repeat this process but click “Seed dispersal” rather than “Pollination” in the second step. Network names have the format “M_T_X” where T is the type of interaction (PL for pollination, SD for seed dispersal) and X is the network identity. Where T = PL, remove all networks where X > 071; where T = SD, remove all networks where X > 034. Networks with identity values greater than these were added to the Web of Life repository after our analyses were conducted. Finally, remove the “M_PL_057” and “M_PL_062” networks, as these were unusually large, containing c. 1,000 species or more.
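
The selection rule above can be written as a small filter. The Python sketch below assumes every network name matches the “M_T_X” format exactly and raises an error for any name that does not, so that such cases can be inspected by hand; the function name and the example list of names are hypothetical.

```python
import re

NAME_PATTERN = re.compile(r"M_(PL|SD)_(\d+)")
MAX_IDENTITY = {"PL": 71, "SD": 34}      # larger identities were added later
TOO_LARGE = {"M_PL_057", "M_PL_062"}     # c. 1,000 species or more


def keep_network(name):
    """Return True if a Web of Life network name passes the selection rule."""
    match = NAME_PATTERN.fullmatch(name)
    if match is None:
        raise ValueError(f"unexpected network name: {name!r}")
    interaction_type, identity = match.group(1), int(match.group(2))
    if identity > MAX_IDENTITY[interaction_type]:
        return False
    return name not in TOO_LARGE


# Hypothetical list of downloaded network names.
downloaded = [
    "M_PL_001", "M_PL_057", "M_PL_062", "M_PL_071", "M_PL_072",
    "M_SD_034", "M_SD_035",
]
selected = [name for name in downloaded if keep_network(name)]
print(selected)
```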

    Plant origin data for the Mauritius network were obtained from Kaiser‐Bunbury et al. (2009, Appendix II) (paper: https://doi.org/10.1016/j.ppees.2009.04.001; Appendix link: https://ars.els-cdn.com/content/image/1-s2.0-S1433831909000183-mmc8.doc). The owners of these data had to deny the request to archive them in a repository that meets the requirements of the BES Data Archiving Policy due to the policies of the journal in which they published their article.
