# orchaRd 2.0: An R package for visualising meta-analyses with orchard plots

**Handling Editor:** Natalie Cooper

## Abstract

- Although meta-analysis has become an essential tool in ecology and evolution, reporting of meta-analytic results can still be much improved. To aid this, we have introduced the orchard plot, which presents not only overall estimates and their confidence intervals, but also shows corresponding heterogeneity (as prediction intervals) and individual effect sizes.
- Here, we have added significant enhancements by integrating many new functionalities into orchaRd 2.0. This updated version allows the visualisation of heteroscedasticity (different variances across levels of a categorical moderator), marginal estimates (e.g. marginalising out effects other than the one visualised), conditional estimates (i.e. estimates of different groups conditioned upon specific values of a continuous variable) and visualisations of all types of interactions between two categorical/continuous moderators.
- orchaRd 2.0 has additional functions which calculate key statistics from multilevel meta-analytic models, such as *I*^{2} and *R*^{2}. Importantly, orchaRd 2.0 contributes to better reporting by complying with PRISMA-EcoEvo (preferred reporting items for systematic reviews and meta-analyses in ecology and evolution). Taken together, orchaRd 2.0 can improve the presentation of meta-analytic results and facilitate the exploration of previously neglected patterns.
- In addition, as part of a literature survey, we found that graphical packages are rarely cited (~3%). We plead that researchers credit developers and maintainers of graphical packages, for example, by citing the relevant packages in a figure legend.

## Summary

- Although meta-analysis has become a fundamental tool in ecology and evolution, reporting the results of a meta-analysis is difficult. To make it easier, we introduced the orchard plot, which presents not only the overall (mean) effect estimate and its confidence intervals, but also shows the corresponding heterogeneity (as prediction intervals) and the effect sizes from individual samples.
- In the second version of the package, orchaRd 2.0, we have added significant improvements by building in many new functionalities. The updated version of the package allows the visualisation of heteroscedasticity (accounting for different variances across levels of a categorical moderator), marginal means (e.g. when marginalising over effects other than the one visualised), conditional means (e.g. estimates of different groups at specific values of a continuous variable) and visualisations of all interactions between two moderators, both categorical and continuous.
- orchaRd 2.0 has additional functions that calculate key statistics from multilevel meta-analytic models, such as *I*^{2} and *R*^{2}. Importantly, orchaRd 2.0 contributes to better reporting of results by complying with PRISMA-EcoEvo (Preferred Reporting Items for Systematic Reviews and Meta-Analyses in Ecology and Evolution). As a result, orchaRd 2.0 can improve the presentation of meta-analytic results and facilitate the exploration of previously overlooked patterns.
- In addition, as part of a systematic literature review, we found that graphical packages are rarely cited (~3%). We ask researchers to credit the creators and maintainers of graphical packages, e.g. by citing the packages used in figure captions.

## 1 INTRODUCTION

Meta-analysis has become an essential synthesis tool across the medical, social and biological sciences (Cooper et al., 2019; Gurevitch et al., 2018; Higgins et al., 2019; Schmid et al., 2021). In fields such as medicine, meta-analytic results are typically shown in a forest plot that presents effect sizes and their 95% confidence intervals (CIs) from each study in the meta-analysis. However, in ecology and evolution, forest plots are infrequently used because meta-analyses in this field often include >100 effect sizes, making a traditional forest plot impractical (Gurevitch et al., 2018; Senior et al., 2016). Instead, researchers use a ‘forest-like plot’ showing overall mean effect size estimates and their 95% CIs for different levels of a categorical moderator (predictor variable). Such estimates are derived from a meta-regression model or from subset/sub-group analyses. For example, such a plot could show estimates from five different taxa, six different geographical areas or three different methods. A recent survey found that 72 out of 102 ecological and evolutionary meta-analyses presented forest-like plots (Nakagawa et al., 2021). Contributing to the popularity of forest-like plots is the fact that meta-analytic moderators are often categorical rather than continuous variables. Despite their popularity, forest-like plots in ecology and evolution often lack important information such as individual effect sizes and estimates of heterogeneity among effect sizes (Nakagawa et al., 2021; Schild & Voracek, 2015).

Nakagawa et al. (2021) introduced an information-rich version of a forest-like plot, named the ‘orchard’ plot. Orchard plots provide (1) point estimates (i.e. meta-analytic means); (2) CIs; (3) prediction intervals (PIs; which show heterogeneity among effect sizes); and (4) individual effect sizes scaled by their precision (the inverse of the square root of the sampling variance). Nakagawa et al. (2021) implemented the orchard plot with functions built on the most popular and comprehensive meta-analysis *R* package, metafor (Viechtbauer, 2010), and on ggplot2 graphics (Wickham, 2009). However, the original implementation (orchaRd 1.0) was limited to single moderator meta-regression models. In addition, it was only possible to visualise a meta-regression model that assumed homoscedasticity across levels of the single categorical moderator (i.e. all levels have the same variance, which may be unrealistic, e.g. Wilson et al., 2022; Zajitschek et al., 2020).
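The four ingredients of an orchard plot can be illustrated numerically. The following is a minimal, library-free Python sketch, not the orchaRd implementation: the function names, the input values and the use of a normal quantile (rather than a *t* quantile) are all assumptions made for illustration. It shows why a PI is always at least as wide as the CI: the PI adds the heterogeneity variance to the squared standard error of the mean.

```python
import math

Z_95 = 1.959964  # normal quantile for a 95% interval (assumption: normal, not t)

def precision(sampling_variance):
    """Precision of one effect size: inverse of the square root of its sampling variance."""
    return 1.0 / math.sqrt(sampling_variance)

def orchard_summary(estimate, se, tau2, z=Z_95):
    """Return (CI, PI) for a meta-analytic mean.

    estimate: meta-analytic mean (hypothetical value)
    se:       standard error of that mean
    tau2:     total heterogeneity variance (sum of random-effect variances)
    """
    ci = (estimate - z * se, estimate + z * se)
    pi_half_width = z * math.sqrt(se ** 2 + tau2)
    pi = (estimate - pi_half_width, estimate + pi_half_width)
    return ci, pi

# Hypothetical inputs, chosen only to illustrate the calculation
ci, pi = orchard_summary(estimate=0.30, se=0.10, tau2=0.16)
point_size = precision(0.04)  # an effect size with sampling variance 0.04
```

With these made-up inputs the CI excludes zero while the PI spans it, which is exactly the kind of contrast that a plot showing only CIs would hide.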

In this article, we enhance the visualisation capabilities of the orchaRd package by integrating the functionalities of the R package emmeans (Lenth et al., 2018) in four ways. The first three extend orchard plots by allowing visualisation of (I) heteroscedasticity (different variances across levels of a categorical moderator; Section 3.1); (II) marginal estimates (e.g. marginalising all other moderators apart from the one visualised; Section 3.2) and (III) conditional estimates (i.e. estimates of different groups/levels of a categorical variable, conditioned upon specific values of a continuous variable; Section 3.3). The fourth capability allows for ‘bubble’ plots to be created of (i) a continuous variable, (ii) interactions between a continuous and categorical variable and (iii) interactions between two continuous variables from multi-moderator models (Section 3.4). In addition, we add helper functions to calculate key statistics from multilevel meta-analytic models such as *I*^{2} (Cheung, 2014) and *R*^{2} (Aloe et al., 2010; Nakagawa & Schielzeth, 2013), along with their CIs (Section 3.5). These new functionalities not only better visualise meta-analytic results in ecology and evolution, but also facilitate the exploration of previously neglected patterns, such as heteroscedasticity in meta-analytic data. Throughout we support the motivation for creating these functionalities with a survey of meta-analyses in ecology and evolution (Section 2).

Notably, orchaRd 2.0 improves reporting transparency in a meta-analysis by following the ‘Preferred Reporting Items for Systematic reviews and Meta-Analyses in Ecology and Evolution’ (PRISMA-EcoEvo; O'Dea et al., 2021). Our package's vignette also provides detailed instructions and examples on how to use all the main functions, as well as how to customise plots (https://daniel1noble.github.io/orchaRd/).

## 2 SURVEY METHODS

To gauge the potential usefulness of the orchaRd package's extensions, we surveyed 102 meta-analyses in ecology and evolution. Notably, this dataset was initially collected to quantify reporting quality of ecological and evolutionary meta-analyses to assist in creating PRISMA-EcoEvo (O'Dea et al., 2021). Briefly, we obtained 102 articles with meta-analyses that were published between 1 January 2010 and 25 March 2019 and part of the ‘Ecology’ and ‘Evolutionary Biology’ journals classified under the InCites Journal Citation Reports (Clarivate Analytics; see more details in O'Dea et al., 2021). We previously explored this dataset to survey the use of forest and forest-like plots in ecology and evolution (Nakagawa et al., 2021).

- Q1: How many papers have at least one categorical variable/moderator? (Defining a moderator as a predictor in a meta-regression analysis).
- Q2: How many papers have at least one test or model for heteroscedasticity?
- Q3: How many papers have at least one model with more than one categorical moderator?
- Q4: How many papers have at least one model with at least one categorical moderator and one continuous moderator?
- Q5: How many papers that used a multi-moderator regression have at least one forest-like plot (figure) made from the multi-moderator meta-regression?
- Q6: How many papers that used a multi-moderator regression also modelled interactions?
- Q7: How many papers, which use *R*, cite an *R* software package they used for meta-analysis?
- Q8: How many papers, which use *R*, cite an *R* software package they used for the graphical presentation of meta-analytic results?

We report relevant results below, but the full results of this survey can be found in the Supporting Information.

## 3 NEW SOFTWARE CAPABILITIES

The orchaRd 2.0 package has six main functions with three different (‘table’, ‘figure’ and ‘statistics’) functionalities: (1) mod_results (creating a table or a table function; see Figure 1), (2) orchard_plot (a figure function), (3) bubble_plot (a figure function), (4) caterpillars (a figure function), (5) i2_ml (calculating *I*^{2} statistics or a statistics function) and (6) r2_ml (a statistics function; each function's description is found in Table 1). Among these six functions, the core function is orchard_plot. This function enables users to draw orchard plots from a table created by mod_results, which uses emmeans functionality (Lenth et al., 2018) to process metafor model objects (object classes: rma, rma.mv and robust.rma; Viechtbauer, 2010). Below we first showcase three new capabilities of orchard_plot. Then, we describe a new function, bubble_plot, followed by the other main functions (caterpillars, i2_ml and r2_ml). Notably, the focus of our orchaRd package is to visualise multilevel meta-analytic models, which deal with two different types of non-independence due to (1) correlated effect sizes (e.g. multiple effect sizes per study) and (2) correlated sampling errors (e.g. shared control groups or shared measurements; see Noble et al., 2017). The former requires adding random effects (e.g. study ID), while the latter requires modelling a within-study variance–covariance matrix (note one can use the vcalc function in metafor to create such a matrix; Viechtbauer, 2010).

Function | Category | Description |
---|---|---|
mod_results | Table | mod_results takes multi-level meta-analytic and meta-regression models (with multiple moderators, continuous or categorical) of class rma.mv/rma/robust.rma and calculates mean or marginalised mean meta-analytic estimates across all levels of a given moderator or overall (i.e. intercept only). The mod_results table can then be used with orchard_plot, bubble_plot or caterpillars to plot results graphically. If a multivariate meta-regression model (with many moderators) is provided, users can specify the ‘by’ and/or ‘at’ arguments to marginalise over desired levels of other moderators |
orchard_plot | Figure | Modified forest plot that plots the meta-analytic means, CIs, prediction intervals and raw data for each level of a categorical moderator. Users can use a number of arguments for modifying the look of plots, including the legend, colour schemes, size and weight of points and lines, and angle and naming of text on the axes. Sub-setting allows users to plot a subset of the levels for a given moderator. Additional modifications can be made by adding and modifying layers of the ggplot object. Plots can be made using either mod_results objects directly or using the rma.mv/rma/robust.rma model object in combination with the raw data. If a multivariate meta-regression model (with many moderators) is provided directly, users can specify the ‘by’ and/or ‘at’ arguments to marginalise over desired levels of other moderators |
bubble_plot | Figure | Creates one or more bubble plots depicting the predicted mean effect size, confidence and prediction intervals as a function of a continuous moderator (slope estimate) or a series of separate plots showing predictions across an additional moderator (i.e. interaction plots). Plots can be made using either mod_results objects directly or using the rma.mv/rma/robust.rma model object in combination with the raw data. Raw data are plotted, and point size is adjusted according to effect size precision |
caterpillars | Figure | Creates a caterpillar plot from an intercept model or from mean effect size estimates for all levels of a given categorical moderator, with their corresponding confidence and prediction intervals. Plots can be made using either mod_results objects directly or using the rma.mv/rma/robust.rma model object in combination with the raw data |
i2_ml | Statistics | Calculates heterogeneity statistics using measures of *I*^{2} for multilevel meta-analytic or meta-regression models. Point estimates can be calculated quickly for each level of random effect along with an estimate of total heterogeneity. Users also have the option of generating 95% CIs for all *I*^{2} estimates using the ‘boot’ argument (percentile method). This argument will conduct parametric bootstrapping |
r2_ml | Statistics | Calculates marginal and conditional *R*^{2} for multilevel meta-analytic or meta-regression models. Point estimates can be calculated quickly using a couple of different methods, but users also have the option of generating 95% CIs for *R*^{2} using the ‘boot’ argument (percentile method). This argument will conduct parametric bootstrapping |

### 3.1 Orchard plots: Heteroscedasticity

Categorical variables (moderators) are extremely common in meta-analyses. In our survey, >97% of the papers had at least one categorical variable. The categorical variable was used to subset data for sub-group analyses, where a series of meta-analyses (intercept models) were run, or to fit a meta-regression model (Q1). In many meta-analyses, researchers assumed all levels of a categorical moderator had the same variation (homoscedasticity). Our survey shows that only 5% of papers investigated heteroscedasticity, while others assumed homoscedasticity (Q2). Yet, differences in variances can be as biologically insightful as differences in means among groups. For example, Pottier et al. (2022) found that not only were aquatic ectotherms more thermally plastic than their terrestrial counterparts, but their plastic responses were much more variable than those of terrestrial ectotherms (even after considering the sample size difference). Our orchard_plot now allows for visualisation of modelled heteroscedasticity by depicting different PIs for different groups (Figure 1). Of importance, modelling heteroscedasticity, when it exists, might reduce Type 1 error (Rubio-Aparicio et al., 2017, 2020); and orchard plots can assist meta-analysts in finding heteroscedasticity. Incidentally, modelling heteroscedasticity for a categorical moderator becomes essential if one wants to obtain absolute group means (e.g. selection gradients; Kingsolver et al., 2012; Siepielski et al., 2017; see also Noble et al., 2018). Absolute estimates can be calculated assuming a ‘folded’ normal distribution (see Morrissey, 2016; Nakagawa & Lagisz, 2016), with the accuracy of mean magnitudes being dependent on within-group variances. As such, it is important that heteroscedasticity is evaluated if such an approach is taken.
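The dependence of absolute group means on within-group variances can be made concrete with the mean of a folded normal distribution. The sketch below is a plain Python illustration under stated assumptions, not code from orchaRd: `folded_normal_mean` is a hypothetical helper, `sigma` stands for the within-group standard deviation of effects, and the two groups are invented.

```python
import math

def folded_normal_mean(mu, sigma):
    """Mean of |X| when X ~ Normal(mu, sigma^2).

    Uses E|X| = sigma*sqrt(2/pi)*exp(-mu^2 / (2*sigma^2)) + mu*erf(mu / (sigma*sqrt(2))).
    """
    if sigma == 0:
        return abs(mu)
    density_term = sigma * math.sqrt(2.0 / math.pi) * math.exp(-mu ** 2 / (2.0 * sigma ** 2))
    shift_term = mu * math.erf(mu / (sigma * math.sqrt(2.0)))
    return density_term + shift_term

# Two hypothetical groups with the same signed mean but different variances
low_var = folded_normal_mean(mu=0.2, sigma=0.1)   # close to |mu|
high_var = folded_normal_mean(mu=0.2, sigma=0.5)  # noticeably larger than |mu|
```

Because the mean magnitude grows with `sigma`, forcing a common variance on groups that actually differ would distort the absolute estimate of at least one of them.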

### 3.2 Orchard plots: Marginal means

Many meta-analyses include multiple variables (moderators), and often they are modelled together in a single meta-regression model. In our survey, meta-analytic studies often modelled two or more categorical moderators together (Q3: 41%) and modelled at least one categorical moderator and one continuous moderator (Q4: 30%). However, only some of the meta-analyses with multi-moderator models reported marginal estimates (Q5: 27%). This is understandable because obtaining ‘marginal’ means becomes difficult once the number of moderators increases unless one relies on computational solutions, for example, via the emmeans package. Therefore, many meta-analysts have been using only estimates from uni-moderator models. We have now made it straightforward to produce marginal means from a multi-moderator meta-regression model using orchard_plot. It is notable that marginalisation is usually done by weighting in proportion to the frequencies in the sample (data) of the different groups that are averaged over. In such a case, marginal means are often similar, if not identical, to means from a uni-moderator model. However, if ‘equal’ weighting is used (giving the same weight to all groups), marginalised means could differ from those from a uni-moderator model, especially when a categorical moderator is unbalanced between groups/levels (Figure 2). Equal weighting is useful, for example, when groups are unequally represented in the dataset but are expected to be roughly 50:50 in the population, as with males and females in many animals (cf. Deffner et al., 2022).
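The difference between proportional and equal weighting is simple arithmetic, sketched below in plain Python. The group means and counts are invented for illustration, and `marginal_mean` is a hypothetical helper rather than a function from orchaRd or emmeans.

```python
def marginal_mean(level_means, weights):
    """Weighted average of level-specific means over the moderator being marginalised."""
    total = sum(weights[level] for level in level_means)
    return sum(level_means[level] * weights[level] for level in level_means) / total

# Hypothetical estimates for one focal group at each level of 'sex'
sex_means = {"female": 0.10, "male": 0.50}
# Hypothetical, strongly unbalanced numbers of effect sizes per level
sex_counts = {"female": 90, "male": 10}

proportional = marginal_mean(sex_means, sex_counts)              # weights follow the sample
equal = marginal_mean(sex_means, {level: 1 for level in sex_means})  # same weight per level
```

Here the proportional marginal mean sits close to the female estimate because females dominate the sample, whereas equal weighting places it midway between the two sexes.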

### 3.3 Orchard plots: Conditional means

As mentioned above, our survey showed that it was not uncommon to have a study with a continuous moderator and a categorical moderator (Q4: 30%). For such a combination, one can estimate group-level means (and overall means) conditioned upon specific values of a continuous moderator (Figure 3). For example, O'Dea et al. (2019) estimated how thermal environments during development affect phenotypic mean and variance. They found that increasing temperature did not change phenotypic means, while phenotypic variance increased as developmental temperature increased. Examining ‘conditional’ means is illuminating and important for statistical inference because the statistical significance of conditional estimates can change along the gradient of a continuous moderator. Yet, none of the 32 papers with a model containing at least one categorical and continuous moderator presented conditional estimates, as for example are depicted in Figure 3b (see also Vendl et al., 2022).
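To see how significance can change along a continuous moderator, consider a single group whose mean is modelled as an intercept plus a slope. The Python sketch below uses invented coefficients and an invented variance-covariance matrix, and a normal quantile is assumed for the interval. It illustrates the general calculation and is not taken from orchaRd or emmeans.

```python
import math

def conditional_estimate(b0, b1, x, var_b0, var_b1, cov_b01, z=1.959964):
    """Estimate and 95% CI of b0 + b1*x, given the coefficients' (co)variances."""
    estimate = b0 + b1 * x
    variance = var_b0 + (x ** 2) * var_b1 + 2.0 * x * cov_b01
    se = math.sqrt(variance)
    return estimate, estimate - z * se, estimate + z * se

# Hypothetical coefficients and (co)variances for one level of a categorical moderator
coefs = dict(b0=0.05, b1=0.04, var_b0=0.01, var_b1=0.0001, cov_b01=-0.0005)

est0, lo0, hi0 = conditional_estimate(x=0.0, **coefs)      # CI overlaps zero
est10, lo10, hi10 = conditional_estimate(x=10.0, **coefs)  # CI excludes zero
```

The same group therefore looks non-significant at one value of the moderator and clearly different from zero at another, which a single unconditional estimate cannot convey.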

### 3.4 Interactions: Orchard, bubbles and bubbleless

In our survey, ~30 (out of 102) meta-analyses modelled some type of interaction (Q6). Three types of interactions might manifest in a meta-analysis: those between (1) categorical–categorical variables; (2) categorical–continuous variables; and (3) continuous–continuous variables. The first type of interaction (categorical–categorical) can be easily visualised using an orchard plot because interactions between two categorical variables can be conceptualised as one categorical variable (e.g. a categorical variable with 2 levels and another with 2 levels are equivalent to a categorical variable with 4 levels; Figure 4a). To plot the second type (categorical–continuous), one can use bubble plots via the bubble_plot function (note that metafor also has a function for bubble plots, called regplot, which provides a single-panel interaction plot, unlike our multi-panel interaction plots; Figure 4b). The third type (continuous–continuous) is the least intuitive one to visualise, but one can also use bubble_plot to draw ‘bubbleless’ plots, which are line plots with multiple panels (Figure 4c); they are bubbleless because often there are only a few or no corresponding data points to plot for a given point of one of the two continuous variables.
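Two of these ideas reduce to short calculations, sketched here in Python under stated assumptions: the level names, coefficients and helper functions are hypothetical and do not come from orchaRd. The first helper crosses two categorical moderators into one combined moderator. The second gives the slope of one continuous moderator at a fixed value of another, which is what each panel of a multi-panel line plot displays.

```python
from itertools import product

def combine_levels(levels_a, levels_b, sep=":"):
    """Cross two categorical moderators into the levels of one combined moderator."""
    return [f"{a}{sep}{b}" for a, b in product(levels_a, levels_b)]

def slope_at(b1, b3, x2):
    """Slope of x1 at a fixed x2 in the model y = b0 + b1*x1 + b2*x2 + b3*x1*x2."""
    return b1 + b3 * x2

# Categorical-categorical: 2 levels x 2 levels behave like one moderator with 4 levels
combined = combine_levels(["aquatic", "terrestrial"], ["juvenile", "adult"])

# Continuous-continuous: one panel per chosen value of the second moderator
panel_slopes = {x2: slope_at(b1=0.20, b3=-0.05, x2=x2) for x2 in (0.0, 2.0, 4.0)}
```

In this invented example the slope of the first moderator shrinks from 0.20 to zero as the second moderator increases, so the panels would show progressively flatter lines.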

### 3.5 Other functions

In addition to orchard and bubble plots, the orchaRd package provides ‘caterpillar’ plots (via the function caterpillars, which is a forest plot without labels for each effect size; see our vignette at https://daniel1noble.github.io/orchaRd/). We also present two new non-plot functions to give meta-analysts convenient tools to quantify heterogeneity and variances explained by multilevel meta-analyses. The function i2_ml calculates *I*^{2}, which is the percentage of variation among effect sizes not driven by sampling error (much of which is due to differences in sample sizes across studies; Higgins & Thompson, 2002). Our function calculates not only the original *I*^{2} (referred to as ‘total’ *I*^{2}) but also the heterogeneity explained by each additional random effect in the model (e.g. heterogeneity due to study ID or due to species ID; sensu Nakagawa & Santos, 2012). Furthermore, different sets of *I*^{2} values can be calculated for different groups (levels) for a categorical moderator model with heteroscedasticity. While *I*^{2} is estimated from a meta-analytic (intercept-only) model, *R*^{2} is used to quantify the variance (heterogeneity) accounted for by moderators. The function r2_ml calculates marginal *R*^{2}, proposed by Nakagawa and Schielzeth (2013) as a pseudo-*R*^{2} for linear mixed-effects models. Notably, both i2_ml and r2_ml can provide 95% CIs, using bootstrapping.
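The logic behind these two statistics can be sketched with a few lines of Python. This is an illustration under explicit assumptions rather than the code used by i2_ml or r2_ml, whose internal calculations may differ. The variance components and sampling variances are invented, the ‘typical’ sampling variance follows the formula of Higgins and Thompson (2002), and the marginal *R*^{2} is taken as the variance of the fixed-effect predictions divided by that variance plus the random-effect variances.

```python
def typical_sampling_variance(sampling_variances):
    """'Typical' sampling variance (Higgins & Thompson, 2002), with weights w = 1/v."""
    k = len(sampling_variances)
    w = [1.0 / v for v in sampling_variances]
    sum_w = sum(w)
    sum_w2 = sum(wi ** 2 for wi in w)
    return (k - 1) * sum_w / (sum_w ** 2 - sum_w2)

def i2_multilevel(sigma2, sampling_variances):
    """Percentage of total variance attributable to each random effect, and in total."""
    s2_m = typical_sampling_variance(sampling_variances)
    total = sum(sigma2.values()) + s2_m
    result = {f"I2_{name}": 100.0 * value / total for name, value in sigma2.items()}
    result["I2_total"] = 100.0 * sum(sigma2.values()) / total
    return result

def r2_marginal(var_fixed, sigma2):
    """Share of (non-sampling) variance accounted for by the moderators."""
    return var_fixed / (var_fixed + sum(sigma2.values()))

# Hypothetical variance components and sampling variances
sigma2 = {"study": 0.10, "effect": 0.05}
i2 = i2_multilevel(sigma2, sampling_variances=[0.05] * 20)
r2 = r2_marginal(var_fixed=0.05, sigma2=sigma2)
```

With these made-up values, three quarters of the total variance is heterogeneity (two thirds of it between studies), and the moderators account for a quarter of the non-sampling variance.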

## 4 IMPROVING REPORTING

### 4.1 PRISMA-EcoEvo and orchaRd

O'Dea et al. (2021) recommend information to be reported in systematic reviews and meta-analyses in ecology and evolution. Visualisations from the orchaRd package are completely consistent with the reporting recommendations of PRISMA-EcoEvo. This is especially so with three (sub-)items, recommended for the Method section: (1) presenting the numbers of studies and effect sizes for each estimate; (2) reporting indicators of heterogeneity; and (3) including estimates and CIs for moderators. The survey conducted by O'Dea et al. (2021) showed very poor reporting of these items: 57%, 52% and 59%, respectively. As one can see, our package takes care of these three items in a single orchard plot (Figures 1-4). It is notable that orchard plots now even visualise different heterogeneities among different groups (i.e. heteroscedasticity) via PIs.

### 4.2 Plea and proposal

Graphical presentation can facilitate better reporting in meta-analyses. However, in our survey, only 2 of the 64 articles that used *R* (3.1%) cited any graphical package(s) used for visualising meta-analytic results (e.g. orchaRd; Q8). This figure starkly contrasts with 85% of the papers (55 out of 64; Q7) citing the software packages used for meta-analyses (e.g. metafor). This survey result marks a severe under-recognition of graphical packages. The real-world risk here is that this lack of recognition severely disincentivises developers from maintaining and further developing graphical packages.

We argue that authors should acknowledge graphical packages used for presenting meta-analyses (or any research article, for that matter), just as they do with any statistical package. We propose that graphical packages that were used to make a figure should be listed at the end of the figure legend. This standardised reporting format will mean packages do not necessarily need to be listed in the methods, but they will still be given credit. We note, however, that an *R* package can have many dependencies (i.e. required *R* packages other than ‘base’ packages). For example, orchaRd 2.0 is dependent on emmeans, ggplot2 and metafor. We freely admit that we do not have a satisfying answer on whether dependencies should also be credited. However, for now we think it is reasonable to suggest that researchers provide the reference (in a figure legend or main text) for the immediate *R* function and package they used to make their figure.

## 5 CONCLUSIONS

As the presence and influence of meta-analyses grow in the field, it is more important than ever to visualise meta-analytic results in an information-rich manner. Here, we have introduced an expanded version of orchaRd (version 2.0), which enables researchers to readily visualise complex as well as simple meta-analytic results, a task that was previously difficult for many. New functionalities that allow for marginal and conditional means to be plotted will improve model communication by allowing for a holistic visual interpretation of the complex numerical information generated by the analysis (see Figures 1-4). We also introduce functions for calculating *I*^{2} and *R*^{2} for multilevel meta-analytic models, which have become standard in ecological and evolutionary meta-analyses. Finally, we hope our paper also becomes a reminder of the importance of acknowledging graphical packages. Adequate attribution of credit will create a more sustainable environment for developers and maintainers of graphical packages.

## AUTHOR CONTRIBUTIONS

Shinichi Nakagawa and Daniel W. A. Noble conceived the initial idea and wrote the first draft. Daniel W. A. Noble and Shinichi Nakagawa led programming and implementations from the inputs from Rose E. O'Dea, Alistair M. Senior and Patrice Pottier. Malgorzata Lagisz, Joanna Rutkowska and Yefeng Yang conducted the survey. All authors contributed to the design of the study and to editing and commenting on drafts.

## ACKNOWLEDGEMENTS

We thank Wolfgang Viechtbauer for his continuous effort to maintain and develop an amazing package, metafor. We also acknowledge that Russell Lenth made his wonderful package, emmeans, compatible with metafor. Shinichi Nakagawa and Malgorzata Lagisz were supported by an ARC (Australian Research Council) Discovery Grant (DP210100812), and Daniel W. A. Noble was supported by an ARC Discovery Grant (DP210101152). Part of the writing was conducted while Shinichi Nakagawa was visiting the Okinawa Institute of Science and Technology (OIST) through the Theoretical Sciences Visiting Program (TSVP). Open access publishing facilitated by University of New South Wales, as part of the Wiley - University of New South Wales agreement via the Council of Australian University Librarians.

## CONFLICT OF INTEREST STATEMENT

The authors report no conflict of interest.

## Open Research

### PEER REVIEW

The peer review history for this article is available at https://www.webofscience.com/api/gateway/wos/peer-review/10.1111/2041-210X.14152.

### DATA AVAILABILITY STATEMENT

All data and code that are part of the R package can be found on GitHub (https://github.com/daniel1noble/orchaRd) and Zenodo (https://doi.org/10.5281/zenodo.7928743; Nakagawa et al., 2023).