raxmlGUI 2.0: A graphical interface and toolkit for phylogenetic analyses using RAxML

raxmlGUI is a graphical user interface to RAxML, one of the most popular and widely used software packages for phylogenetic inference using maximum likelihood. Here we present raxmlGUI 2.0, a complete rewrite of the GUI which seamlessly integrates RAxML binaries for all major operating systems with an intuitive graphical front-end to set up and run phylogenetic analyses. Our program offers automated pipelines for analyses that require multiple successive calls of RAxML, built-in functions to concatenate alignment files while automatically specifying the appropriate partition settings, and one-click model testing to select the best substitution models using ModelTest-NG. In addition to RAxML 8.x, raxmlGUI 2.0 also supports the new RAxML-NG, which provides new functionality and higher performance on large datasets. raxmlGUI 2.0 facilitates phylogenetic analyses by coupling an intuitive interface with the unmatched performance of RAxML.


| INTRODUCTION
Phylogenetic inference is a keystone in evolutionary biology research, providing the foundations for tackling a wide range of questions, from population dynamics to taxonomy of higher taxa (Felsenstein, 2003).
RAxML is one of the most widely used programs in phylogenetic analysis, implementing extremely fast algorithms to analyse large datasets using maximum likelihood (Stamatakis, 2014). Despite the undisputed efficiency of RAxML, the program is only available through a command-line interface. This requires users to be familiar with the shell environment and to navigate through the ever-growing number of commands implemented in the program, which may exclude many potential users without such experience. raxmlGUI (Silvestro & Michalak, 2012) is a graphical interface intended to facilitate phylogenetic analyses using RAxML by providing a graphical front-end to help users set up their analyses. Although this interface has been widely used, there are many areas for improvement in terms of accessibility, usage and performance.
Here, we present raxmlGUI 2.0, a complete rewrite of the raxmlGUI program. This version brings a new cross-platform design, novel functionalities and a seamless integration with both RAxML 8.2 and RAxML-NG, the new RAxML Next Generation. Similar to its predecessor, raxmlGUI 2.0 is designed to be easy to use as a cross-platform stand-alone program that does not require the installation of additional software or an internet connection. It provides the user with an intuitive interface with access to the model settings required to set up and run a phylogenetic analysis, and can handle datasets comprising up to many thousands of terminals (Lemoine et al., 2018). The GUI additionally includes an option to choose the best-fitting substitution model and provides a number of automated options to parse, concatenate and partition alignments and to run analytical pipelines combining multiple RAxML calls. raxmlGUI 2.0 targets a wide user base ranging from beginners to advanced phylogeneticists seeking easy access to the state-of-the-art analytical tools implemented in RAxML and RAxML-NG.

| METHODS
The program comes with pre-compiled, integrated versions of RAxML for the major operating systems (MacOS, Windows, Linux), including the PTHREADS and SSE3 versions (Stamatakis, 2014), allowing the user to run faster analyses using parallel computing when multiple CPUs are available. Pre-compiled versions of RAxML-NG are provided for MacOS and Linux. A Windows version will be added when available from the RAxML-NG development team.
Loading an alignment file creates a list of taxa in the Outgroup menu button, which can be used to root the tree based on a user-defined outgroup. Note that maximum likelihood trees can always be re-rooted after the analysis using tree-viewing software such as FigTree (Rambaut, 2012).
FIGURE 1 The raxmlGUI 2.0 interface, upon loading an alignment file with nucleotide sequences. In the left panel, the input section provides options to load new alignments, create a concatenated file for partitioned datasets, and specify partition-specific substitution matrices. The optimize button allows the user to perform model testing and automatically specify the best-fitting substitution model. The analysis section provides options to specify the type of analysis and outgroup selection. The output section gives easy access to the folder with the input files and a list of output files that appears upon completing the analysis. In the right panel, the user can select the version of RAxML (RAxML-NG shown in the figure), start the analysis and visualize the output. A tab bar at the top of the window allows users to easily set up multiple runs and switch between them.
Phylogenetic analyses can be run based on different types of data: nucleotide sequences (DNA, RNA), amino acid sequences, discrete binary and multi-state characters (e.g. used for descriptions of morphological data). Since each data type requires a specific class of substitution models, raxmlGUI 2.0 automatically recognizes the data type from the loaded input file and provides the user with a dropdown menu showing all the substitution models compatible with the alignment.
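As an illustration of how a data type might be recognized from the characters in an alignment, the following is a minimal sketch and not the detection logic used by raxmlGUI 2.0. The function name `guess_data_type`, the character sets and the order of the checks are all assumptions made for this example.

```python
# Hypothetical character sets; the rules used by raxmlGUI 2.0 may differ.
NUCLEOTIDE = set("ACGTUNRYKMSWBDHV")            # bases plus IUPAC ambiguity codes
AMINO_ACID = set("ACDEFGHIKLMNPQRSTVWYBZXJOU")  # residues plus ambiguity codes
BINARY = set("01")
DIGITS = set("0123456789")
MISSING = set("-?")                             # gap and missing-data symbols


def guess_data_type(sequences):
    """Guess the data type of an alignment from the characters it contains."""
    chars = set()
    for seq in sequences:
        chars.update(seq.upper())
    chars -= MISSING
    if not chars:
        return "unknown"
    # Check the most restrictive alphabets first.
    if chars <= BINARY:
        return "binary"
    if chars <= DIGITS:
        return "multistate"
    if chars <= NUCLEOTIDE:
        return "nucleotide"
    if chars <= AMINO_ACID:
        return "amino acid"
    return "unknown"


print(guess_data_type(["ACGT-N", "ACGTRY"]))
print(guess_data_type(["MKVLAW", "MKV-AW"]))
```

Because the nucleotide alphabet is a subset of the amino acid alphabet, a heuristic of this kind can misclassify very short protein alignments, which is one reason a GUI would still let the user override the detected type.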

| Automatic concatenation of alignments and partitions
An important feature of raxmlGUI 2.0 is the automated concatenation and partitioning of alignments, which simplifies the analysis of multiple genes or combination of different data types, for example, amino acid sequences and morphological data. After loading the first alignment, the user can add new ones to concatenate them into a single analysis. Upon loading additional alignments, raxmlGUI 2.0 performs the following tasks:
• Parse the data to determine the data type (nucleotides, amino acids, multistate).
• Parse the taxa names to make sure the concatenation of sequences occurs across matching taxa even if they are listed in different order among input files.
• For any mismatch between taxa of different partitions, offer the option to automatically fill in sequences of missing data in the concatenated alignment or to drop taxa that lack sequences in any partition.
• Set default partitions for the new alignments and re-compute the concatenated partition.
These features facilitate the concatenation of different alignment files, the creation of the partition files and the generation of sparse matrices resulting from the combination of datasets with different and only partly overlapping taxonomic coverage. These tools also reduce the probability of errors stemming from manually merging sequences by matching taxa names. Additionally, raxmlGUI 2.0 provides an intuitive interface to create partitions within a single alignment file, including the possibility to specify codon-based evolutionary models for coding nucleotide sequences (Figure 2). Finally, users can load their own partition files, which must be provided in a RAxML-compatible format (Figure 1).
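The concatenation tasks listed above can be sketched as follows. This is a simplified illustration under stated assumptions, not the code used in raxmlGUI 2.0: the function `concatenate`, its arguments and the in-memory representation of an alignment are hypothetical, and the partition lines follow the general style of RAxML 8.x partition files (data type label, partition name, site range).

```python
def concatenate(alignments, drop_incomplete=False, missing="?"):
    """Concatenate alignments across matching taxa.

    alignments: list of (name, data_type, {taxon: sequence}) tuples
                (a hypothetical in-memory representation).
    Returns the concatenated {taxon: sequence} dict and the partition lines.
    """
    # Collect taxa in order of first appearance, regardless of file order.
    all_taxa = []
    for _, _, seqs in alignments:
        for taxon in seqs:
            if taxon not in all_taxa:
                all_taxa.append(taxon)
    if drop_incomplete:
        # Keep only taxa that are present in every partition.
        all_taxa = [t for t in all_taxa
                    if all(t in seqs for _, _, seqs in alignments)]

    merged = {t: "" for t in all_taxa}
    partitions = []
    start = 1
    for name, data_type, seqs in alignments:
        length = len(next(iter(seqs.values())))
        if any(len(s) != length for s in seqs.values()):
            raise ValueError(f"sequences in {name} differ in length")
        for t in all_taxa:
            # Taxa absent from this alignment get a run of missing data.
            merged[t] += seqs.get(t, missing * length)
        end = start + length - 1
        partitions.append(f"{data_type}, {name} = {start}-{end}")
        start = end + 1
    return merged, partitions


gene1 = ("gene1", "DNA", {"A": "ACGT", "B": "ACGA"})
gene2 = ("gene2", "DNA", {"B": "TTT", "C": "GGG"})
merged, partitions = concatenate([gene1, gene2])
print(merged)
print("\n".join(partitions))
```

With `drop_incomplete=True`, only taxon `B` would be retained in this toy example, which corresponds to the option of dropping taxa with missing sequences in any partition.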

| Support for both RAxML 8.x and RAxML-NG
In addition to RAxML 8.x, raxmlGUI 2.0 adds support for RAxML Next Generation, which provides new options and improved performance for the very large datasets typical of genomic analyses. Among the novel methods implemented in RAxML-NG, and available through raxmlGUI 2.0, is the Transfer Bootstrap Expectation algorithm to quantify topological support for a tree (Lemoine et al., 2018). This algorithm has been shown to outperform the traditional bootstrap analysis (Felsenstein, 1985) when applied to large phylogenetic trees (thousands of tips). The user can select which version of RAxML to run from the GUI, and the available settings are automatically updated for the selected version. For guidelines on which RAxML version to use for particular objectives and datasets, please refer to Kozlov et al. (2019).

| Model testing
One of the advantages of RAxML-NG over RAxML is its increased range of available substitution models for nucleotide and amino acid data. This feature also allows users to define different substitution models for each partition, for example, when analysing concatenated genes. To facilitate the use of these features, we implemented a model testing feature in raxmlGUI 2.0 that allows the user to select the best substitution model based on the corrected Akaike Information Criterion (AICc; Burnham & Anderson, 2002).
Model testing is carried out using the program ModelTest-NG, and is seamlessly integrated within raxmlGUI 2.0 through the OPTIMIZE button (Figure 1). The test can be run separately for each partition and the best model will be specified automatically for the following analysis. As for RAxML-NG, ModelTest-NG is currently provided for MacOS and Linux, whereas Windows support will be added as soon as a compatible version is made available by the ModelTest-NG development team.
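To make the selection criterion concrete, the sketch below computes the AICc from a log-likelihood, a number of free parameters and a sample size, and picks the candidate with the lowest score. In raxmlGUI 2.0 this computation is performed by ModelTest-NG; the function names, the use of alignment length as the sample size, and the likelihoods and parameter counts in the example are assumptions chosen only for illustration.

```python
def aicc(log_likelihood, n_params, n_sites):
    """Corrected Akaike Information Criterion.

    AICc = 2k - 2 ln L + 2k(k + 1) / (n - k - 1),
    with k free parameters and sample size n (here assumed to be
    the number of alignment sites).
    """
    if n_sites - n_params - 1 <= 0:
        raise ValueError("sample size too small for the AICc correction")
    aic = 2 * n_params - 2 * log_likelihood
    return aic + (2 * n_params * (n_params + 1)) / (n_sites - n_params - 1)


def best_model(candidates, n_sites):
    """Return the model name with the lowest AICc.

    candidates: {model_name: (log_likelihood, n_params)}
    """
    return min(candidates, key=lambda m: aicc(*candidates[m], n_sites))


# Hypothetical scores for two models fitted to a 500-site alignment.
candidates = {"JC": (-1050.0, 1), "GTR+G": (-1000.0, 9)}
print(best_model(candidates, n_sites=500))
```

The correction term grows as the number of parameters approaches the sample size, so the AICc penalizes parameter-rich models more strongly on short alignments than the uncorrected AIC does.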

| Performance and implementation
There is no performance difference between running RAxML on the command line and running it from the GUI, because raxmlGUI 2.0 simply forwards all settings as parameters to the command-line version of RAxML and runs it as a separate process. raxmlGUI 2.0 also supports a tabbed interface for running multiple analyses in parallel.
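The general pattern of forwarding settings to a command-line program and running it as a separate process can be sketched as below. This is an illustrative assumption, not the implementation of raxmlGUI 2.0: the helper names, default values and file names are hypothetical, and the exact arguments that the GUI passes are not described here. The options shown (`--all`, `--msa`, `--model`, `--prefix`, `--threads`, `--bs-trees`, `--bs-metric`) are RAxML-NG command-line options, with `tbe` requesting Transfer Bootstrap Expectation support values.

```python
import subprocess


def build_raxml_ng_command(msa, model, prefix, threads=2,
                           bootstrap_trees=100, metric="tbe",
                           binary="raxml-ng"):
    """Translate GUI-style settings into a RAxML-NG argument list."""
    return [
        binary, "--all",                      # ML search plus bootstrapping
        "--msa", msa,
        "--model", model,
        "--prefix", prefix,
        "--threads", str(threads),
        "--bs-trees", str(bootstrap_trees),
        "--bs-metric", metric,
    ]


def run(command):
    """Run the command as a separate process and return its exit code."""
    return subprocess.run(command, check=False).returncode


# Hypothetical file names; the analysis itself is not started here.
command = build_raxml_ng_command("alignment.fasta", "GTR+G", "run1")
print(" ".join(command))
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with file paths that contain spaces, and keeping the child process separate means the analysis runs at the same speed as it would from a terminal.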

| CONCLUSION
We presented a graphical interface providing intuitive and user-friendly access to the high-performance phylogenetic software RAxML and RAxML-NG, without compromising performance. Our implementation allows students, professionals and researchers to use the latest, state-of-the-art methods to build robust phylogenetic hypotheses, irrespective of their computing skills. We hope research and teaching in different fields involving phylogenetic inference (from evolutionary biology to taxonomy, from drug discovery to epidemiology) can benefit from using our program.

ACKNOWLEDGEMENTS
We thank I. Michalak, three anonymous reviewers, and many raxmlGUI users for feedback on the program. D.E. was supported
FIGURE 2 The raxmlGUI 2.0 partition editor. A graphical user interface makes it easy to define a partition for individual alignments