skbio.tree.gme#
- skbio.tree.gme(dm, neg_as_zero=True)[source]#
Perform greedy minimum evolution (GME) for phylogenetic reconstruction.
Added in version 0.6.2.
Changed in version 0.6.3: Computational efficiency significantly improved.
- Parameters:
- dmskbio.DistanceMatrix
Input distance matrix containing distances between taxa.
- neg_as_zerobool, optional
If True (default), convert negative branch lengths into zeros.
- Returns:
- TreeNode
Reconstructed phylogenetic tree.
Notes
Greedy Minimum Evolution (GME) [1] is a distance-based algorithm for phylogenetic reconstruction. It utilizes the minimum evolution (ME) principle [2] for selecting a tree topology with the lowest sum of branch lengths, as calculated using an ordinary least squares (OLS) framework [3].
GME is O(n2) in time and O(n) in space, making it a more scalable method than multiple alternatives (e.g.,
nj
andbme
). Therefore it is suitable for reconstructing very large phylogenetic trees.GME generates an unrooted tree with variable tip heights and potentially some negative branch lengths. The first taxon in the distance matrix is always placed as a child of the root node. See the notes of
nj
for how to deal with unrooted trees and negative branch lengths.A GME-generated tree may be further improved by executing the FastNNI algorithm implemented in
nni
(withbalanced=False
).A similar but less scalable algorithm using the balanced instead of OLS framework is provided in
bme
.The same methods underlying
gme
,bme
andnni
were also provided by the software package FastME [4].Note
These scikit-bio functions were implemented following the original paper [1]. It is not guaranteed that they will precisely mirror FastME’s output. Although in practices they usually generate identical or equally optimal phylogenetic trees as FastME does.
References
[1] (1,2)Desper, R., & Gascuel, O. (2002). Fast and accurate phylogeny reconstruction algorithms based on the minimum-evolution principle. J Comput Biol, 9(5), 687-705.
[2]Rzhetsky, A., & Nei, M. (1993). Theoretical foundation of the minimum-evolution method of phylogenetic inference. Mol Biol Evol, 10(5), 1073-1095.
[3]Cavalli-Sforza, L. L., & Edwards, A. W. (1967). Phylogenetic analysis. Models and estimation procedures. American journal of human genetics, 19(3 Pt 1), 233.
[4]Lefort, V., Desper, R., & Gascuel, O. (2015). FastME 2.0: a comprehensive, accurate, and fast distance-based phylogeny inference program. Mol Biol Evol, 32(10), 2798-2800.
Examples
Define a new distance matrix object describing the distances between five taxa: human, monkey, pig, rat, and chicken.
>>> from skbio import DistanceMatrix >>> from skbio.tree import gme
>>> dm = DistanceMatrix([[0, 0.02, 0.18, 0.34, 0.55], ... [0.02, 0, 0.19, 0.35, 0.55], ... [0.18, 0.19, 0, 0.34, 0.54], ... [0.34, 0.35, 0.34, 0, 0.62], ... [0.55, 0.55, 0.54, 0.62, 0]], ... ['human','monkey','pig','rat','chicken'])
Perform Greedy Minimum Evoltuion (GME) and construct the minimum evolution tree representing the relationship between those taxa. This is returned as a TreeNode object.
>>> tree = gme(dm) >>> print(tree.ascii_art()) /-monkey | | /-pig |---------| ---------| | /-rat | \--------| | \-chicken | \-human