skbio.tree.gme#

skbio.tree.gme(dm, neg_as_zero=True)[source]#

Perform greedy minimum evolution (GME) for phylogenetic reconstruction.

Added in version 0.6.2.

Changed in version 0.6.3: Computational efficiency significantly improved.

Parameters:
dmskbio.DistanceMatrix

Input distance matrix containing distances between taxa.

neg_as_zerobool, optional

If True (default), convert negative branch lengths into zeros.

Returns:
TreeNode

Reconstructed phylogenetic tree.

See also

bme
nni
nj

Notes

Greedy Minimum Evolution (GME) [1] is a distance-based algorithm for phylogenetic reconstruction. It utilizes the minimum evolution (ME) principle [2] for selecting a tree topology with the lowest sum of branch lengths, as calculated using an ordinary least squares (OLS) framework [3].

GME is O(n2) in time and O(n) in space, making it a more scalable method than multiple alternatives (e.g., nj and bme). Therefore it is suitable for reconstructing very large phylogenetic trees.

GME generates an unrooted tree with variable tip heights and potentially some negative branch lengths. The first taxon in the distance matrix is always placed as a child of the root node. See the notes of nj for how to deal with unrooted trees and negative branch lengths.

A GME-generated tree may be further improved by executing the FastNNI algorithm implemented in nni (with balanced=False).

A similar but less scalable algorithm using the balanced instead of OLS framework is provided in bme.

The same methods underlying gme, bme and nni were also provided by the software package FastME [4].

Note

These scikit-bio functions were implemented following the original paper [1]. It is not guaranteed that they will precisely mirror FastME’s output. Although in practices they usually generate identical or equally optimal phylogenetic trees as FastME does.

References

[1] (1,2)

Desper, R., & Gascuel, O. (2002). Fast and accurate phylogeny reconstruction algorithms based on the minimum-evolution principle. J Comput Biol, 9(5), 687-705.

[2]

Rzhetsky, A., & Nei, M. (1993). Theoretical foundation of the minimum-evolution method of phylogenetic inference. Mol Biol Evol, 10(5), 1073-1095.

[3]

Cavalli-Sforza, L. L., & Edwards, A. W. (1967). Phylogenetic analysis. Models and estimation procedures. American journal of human genetics, 19(3 Pt 1), 233.

[4]

Lefort, V., Desper, R., & Gascuel, O. (2015). FastME 2.0: a comprehensive, accurate, and fast distance-based phylogeny inference program. Mol Biol Evol, 32(10), 2798-2800.

Examples

Define a new distance matrix object describing the distances between five taxa: human, monkey, pig, rat, and chicken.

>>> from skbio import DistanceMatrix
>>> from skbio.tree import gme
>>> dm = DistanceMatrix([[0, 0.02,  0.18,  0.34,  0.55],
...                      [0.02,  0, 0.19, 0.35,  0.55],
...                      [0.18, 0.19,  0,  0.34,  0.54],
...                      [0.34, 0.35,  0.34,  0,  0.62],
...                      [0.55,  0.55,  0.54,  0.62,  0]],
...                     ['human','monkey','pig','rat','chicken'])

Perform Greedy Minimum Evoltuion (GME) and construct the minimum evolution tree representing the relationship between those taxa. This is returned as a TreeNode object.

>>> tree = gme(dm)
>>> print(tree.ascii_art())
          /-monkey
         |
         |          /-pig
         |---------|
---------|         |          /-rat
         |          \--------|
         |                    \-chicken
         |
          \-human