skbio.tree.upgma
- skbio.tree.upgma(dm, weighted=False)
Perform unweighted pair group method with arithmetic mean (UPGMA) or its weighted variant (WPGMA) for phylogenetic reconstruction.
- Parameters:
- dm : skbio.DistanceMatrix
The input distance matrix.
- weighted : bool, optional
If True, perform WPGMA instead of UPGMA. WPGMA is a variant of UPGMA that gives the two subtrees being merged equal weight when averaging distances, regardless of how many taxa each contains. Default is False.
- Returns:
- TreeNode
A TreeNode object with estimated edge values.
Notes
UPGMA (unweighted pair group method with arithmetic mean) is a simple hierarchical clustering method that iteratively groups proximal taxa or taxon groups to form a tree structure. A weighted variant is known as WPGMA, and both variants are due to Sokal and Michener [1].
This function wraps SciPy’s linkage function, with the method parameter set to “average” (UPGMA) or “weighted” (WPGMA). It takes a scikit-bio DistanceMatrix object and returns a scikit-bio TreeNode object.
UPGMA creates a rooted, ultrametric tree: all tips have the same height (distance from the root node).
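As an illustration of the iterative grouping described above, here is a minimal pure-Python sketch of the procedure. It is not the code this function runs (the function delegates to SciPy’s linkage). The name `pgma_newick` and its Newick-string return value are assumptions made for this example only.

```python
from itertools import combinations


def pgma_newick(matrix, ids, weighted=False):
    """Illustrative UPGMA/WPGMA sketch (hypothetical helper, not skbio code)."""
    # Each cluster maps an integer key to (newick_text, node_height, n_tips).
    clusters = {i: (name, 0.0, 1) for i, name in enumerate(ids)}
    dist = {frozenset((i, j)): float(matrix[i][j])
            for i, j in combinations(range(len(ids)), 2)}
    next_id = len(ids)
    while len(clusters) > 1:
        # Pick the closest pair among the current clusters.
        i, j = min(combinations(sorted(clusters), 2),
                   key=lambda pair: dist[frozenset(pair)])
        name_i, h_i, s_i = clusters.pop(i)
        name_j, h_j, s_j = clusters.pop(j)
        # The new node sits at half the distance between the merged clusters.
        height = dist[frozenset((i, j))] / 2
        # Each branch length is the parent height minus the child height.
        newick = f"({name_i}:{height - h_i:g},{name_j}:{height - h_j:g})"
        for k in clusters:
            d_ik = dist[frozenset((i, k))]
            d_jk = dist[frozenset((j, k))]
            if weighted:
                # WPGMA: plain mean, ignoring cluster sizes.
                d_new = (d_ik + d_jk) / 2
            else:
                # UPGMA: mean weighted by the number of tips in each cluster.
                d_new = (s_i * d_ik + s_j * d_jk) / (s_i + s_j)
            dist[frozenset((next_id, k))] = d_new
        clusters[next_id] = (newick, height, s_i + s_j)
        next_id += 1
    newick, _, _ = next(iter(clusters.values()))
    return newick + ";"
```

For the three-taxon matrix used in the Examples section, `pgma_newick([[0, 1, 2], [1, 0, 3], [2, 3, 0]], "abc")` gives `(c:1.25,(a:0.5,b:0.5):0.75);`. With only three taxa, UPGMA and WPGMA coincide; the two averaging rules start to differ once a cluster of two or more tips is merged with a cluster of a different size.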
References
[1] Sokal, R.R., & Michener, C.D. (1958). A statistical method for evaluating systematic relationships. University of Kansas Science Bulletin, 38, 1409-1438.
Examples
Define a distance matrix object for the taxa a, b, and c.
>>> from skbio import DistanceMatrix
>>> from skbio.tree import upgma
>>> data = [[0, 1, 2],
...         [1, 0, 3],
...         [2, 3, 0]]
>>> ids = list('abc')
>>> dm = DistanceMatrix(data, ids)
Construct a tree using UPGMA.
>>> tree = upgma(dm)
>>> print(tree.ascii_art())
          /-c
---------|
         |          /-a
          \--------|
                    \-b
The tree also has estimated edge values assigned to each edge.
>>> print(tree)
(c:1.25,(a:0.5,b:0.5):0.75);
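The edge values above can be reproduced by hand, which also shows why every tip ends up at the same distance from the root. The short calculation below is only a sketch of that arithmetic for this three-taxon example; the variable names are made up for illustration.

```python
# Pairwise distances taken from the example matrix.
d_ab, d_ac, d_bc = 1, 2, 3

# a and b are the closest pair; their parent sits at half their distance.
h_ab = d_ab / 2                 # length of the edges leading to a and b

# The distance from the (a, b) cluster to c is the mean of d_ac and d_bc.
d_ab_c = (d_ac + d_bc) / 2
h_root = d_ab_c / 2             # root height, i.e. the length of edge c

# The internal edge spans the gap between the two node heights.
edge_ab = h_root - h_ab

print(h_ab, h_root, edge_ab)    # 0.5 1.25 0.75
```

Adding the edges along each root-to-tip path gives 0.5 + 0.75 = 1.25 for a and b, and 1.25 for c, so all three tips share the same height.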