skbio.tree.nj
- skbio.tree.nj(dm, disallow_negative_branch_length=True, result_constructor=None)
Perform neighbor joining (NJ) for phylogenetic reconstruction.
- Parameters:
- dm : skbio.DistanceMatrix
Input distance matrix containing distances between taxa.
- disallow_negative_branch_length : bool, optional
Neighbor joining can result in negative branch lengths, which don’t make sense in an evolutionary context. If True, negative branch lengths will be returned as zero, a common strategy for handling this issue that was proposed by the original developers of the algorithm.
- result_constructor : function, optional
Function to apply to construct the result object. This must take a newick-formatted string as input. The result of applying this function to a newick-formatted string will be returned from this function. This defaults to lambda x: TreeNode.read(StringIO(x), format='newick').
- Returns:
- TreeNode
By default, the result object is a TreeNode, though this can be overridden by passing result_constructor.
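To illustrate why disallow_negative_branch_length exists, the sketch below applies the textbook NJ branch-length formula to one pair of taxa from a non-additive set of distances, then replaces the negative result with zero. The helper names (limb_lengths, clamp_negative) and the three-taxon distances are hypothetical, and this is not scikit-bio's internal code; it only mirrors the behaviour described for the parameter.

```python
def limb_lengths(d_ij, sum_i, sum_j, n):
    """Textbook NJ branch lengths from taxa i and j to their new parent node.

    d_ij is the distance between i and j, sum_i and sum_j are the row sums
    of the distance matrix for i and j, and n is the current number of taxa.
    """
    len_i = 0.5 * d_ij + (sum_i - sum_j) / (2 * (n - 2))
    len_j = d_ij - len_i
    return len_i, len_j


def clamp_negative(length):
    """Return zero in place of a negative branch length."""
    return max(0.0, length)


# Hypothetical distances that violate the triangle inequality:
# d(a, b) = 1, d(a, c) = 1, d(b, c) = 5, so the row sums are 2 and 6.
raw = limb_lengths(d_ij=1.0, sum_i=2.0, sum_j=6.0, n=3)
clamped = tuple(clamp_negative(x) for x in raw)
print(raw)      # the first length is negative
print(clamped)  # the negative length is replaced by zero
```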
Notes
Neighbor joining was initially described in Saitou and Nei (1987) [1]. The example presented here is derived from the Wikipedia page on neighbor joining [2]. Gascuel and Steel (2006) provide a detailed overview of neighbor joining in terms of its biological relevance and limitations [3].
Neighbor joining, by definition, creates unrooted trees. One strategy for rooting the resulting trees is midpoint rooting, which is accessible as TreeNode.root_at_midpoint.
References
[1] Saitou N, and Nei M. (1987) “The neighbor-joining method: a new method for reconstructing phylogenetic trees.” Molecular Biology and Evolution. PMID: 3447015.
[3] Gascuel O, and Steel M. (2006) “Neighbor-Joining Revealed.” Molecular Biology and Evolution, Volume 23, Issue 11, November 2006, Pages 1997–2000. https://doi.org/10.1093/molbev/msl072
Examples
Define a new distance matrix object describing the distances between five taxa: a, b, c, d, and e.
>>> from skbio import DistanceMatrix
>>> from skbio.tree import nj
>>> data = [[0,  5,  9,  9,  8],
...         [5,  0, 10, 10,  9],
...         [9, 10,  0,  8,  7],
...         [9, 10,  8,  0,  3],
...         [8,  9,  7,  3,  0]]
>>> ids = list('abcde')
>>> dm = DistanceMatrix(data, ids)
Construct the neighbor joining tree representing the relationship between those taxa. This is returned as a TreeNode object.
>>> tree = nj(dm)
>>> print(tree.ascii_art())
          /-d
         |
         |          /-c
         |---------|
---------|         |          /-b
         |          \--------|
         |                    \-a
         |
          \-e
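To see why a and b end up as sister taxa in the tree above, the first iteration of the textbook neighbor joining procedure can be worked by hand on the same distances. The sketch below is plain Python and is not how scikit-bio implements nj internally. It computes the Q criterion for every pair, picks the pair with the lowest value, and derives the two branch lengths to the new parent node, which agree with the values for a and b in the newick string in the next example.

```python
from itertools import combinations

ids = list("abcde")
data = [[0, 5, 9, 9, 8],
        [5, 0, 10, 10, 9],
        [9, 10, 0, 8, 7],
        [9, 10, 8, 0, 3],
        [8, 9, 7, 3, 0]]

n = len(ids)
row_sums = [sum(row) for row in data]

# Q(i, j) = (n - 2) * d(i, j) - sum_i - sum_j; the pair with the lowest Q is joined.
q = {(i, j): (n - 2) * data[i][j] - row_sums[i] - row_sums[j]
     for i, j in combinations(range(n), 2)}
i, j = min(q, key=q.get)

# Branch lengths from the two joined taxa to their new parent node.
len_i = 0.5 * data[i][j] + (row_sums[i] - row_sums[j]) / (2 * (n - 2))
len_j = data[i][j] - len_i

first_pair = (ids[i], ids[j])
print(first_pair, q[(i, j)], len_i, len_j)
```

A full implementation would then replace the joined pair with the new node, recompute distances to it, and repeat until three nodes remain.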
Again, construct the neighbor joining tree, but instead return the newick string representing the tree, rather than the TreeNode object. (Note that in this example the string output is truncated when printed to facilitate rendering.)
>>> newick_str = nj(dm, result_constructor=str)
>>> print(newick_str[:55], "...")
(d:2.000000, (c:4.000000, (b:3.000000, a:2.000000):3.00 ...
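Because result_constructor only needs to accept a newick-formatted string, any callable with that shape could be supplied in place of str. As a hedged sketch, the hypothetical helper below (leaf_branch_lengths is not part of scikit-bio) extracts leaf names and branch lengths from a newick string with a regular expression. It assumes simple, unquoted leaf names and is demonstrated on a made-up string, not on actual nj output.

```python
import re


def leaf_branch_lengths(newick):
    """Map each leaf name to its branch length in a newick string.

    Assumes simple, unquoted leaf names that start with a letter or
    underscore and are followed by ':<length>'.
    """
    pattern = r"([A-Za-z_]\w*):([0-9.eE+-]+)"
    return {name: float(length) for name, length in re.findall(pattern, newick)}


# Hypothetical newick string standing in for what nj would pass in.
example = "(x:1.000000, (y:2.000000, z:3.000000):0.500000);"
lengths = leaf_branch_lengths(example)
print(lengths)
```

Under these assumptions, such a callable could be passed as nj(dm, result_constructor=leaf_branch_lengths), and its return value would be what nj returns.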
Notice that the tree constructed using neighbor joining is not rooted at a leaf node, unlike a tree constructed using minimum evolution, so it must be re-rooted before nearest neighbor interchange can be performed.