skbio.tree.TreeNode.subsets

TreeNode.subsets(within=None, include_full=False, include_tips=False, map_to_length=False)

Return all subsets of taxa defined by nodes descending from self.

Parameters:
within : iterable of str, optional

A custom set of taxa to refine the result. Only taxa within it will be considered. If None (default), all taxa in the tree will be considered.

Added in version 0.6.3.

include_full : bool, optional

Whether to include a set of all taxa in the result. Default is False, as such a set provides no topological information.

Added in version 0.6.3.

include_tips : bool, optional

Whether to include subsets with only one taxon in the result. Default is False, as such sets provide no topological information.

Added in version 0.6.3.

map_to_length : bool, optional

If True, return a mapping of subsets to their branch lengths. Missing branch lengths will be replaced with 0. Default is False.

Added in version 0.6.3.

Returns:
frozenset of frozenset of str, or

All subsets of taxa defined by nodes descending from self. Returned if map_to_length is False.

dict of {frozenset of str: float}

Mapping of all subsets of taxa to their branch lengths. Returned if map_to_length is True.

Notes

The returned value represents the tree as a set of nested sets, each of which represents a clade in the tree. It is useful for assessing topological patterns of a tree.

The returned value itself and each of its components (frozensets) are unordered and hashable, which makes lookup and comparison efficient. For example, one can check whether a group of taxa forms a clade in the tree, regardless of the clade's internal structure.
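The lookup and comparison properties described above follow from plain Python set semantics. The sketch below does not use scikit-bio: it models a tree as nested tuples of tip names and uses a hypothetical helper, `clade_subsets`, to build the same kind of set of frozensets (excluding single-taxon sets and the full set, mirroring the defaults). It only illustrates the data structure and is not the library's implementation.

```python
def clade_subsets(node):
    """Return (tips, subsets) for a tree written as nested tuples of tip names."""
    if isinstance(node, str):           # a tip contributes itself, but no subset
        return frozenset([node]), set()
    tips, subsets = frozenset(), set()
    for child in node:
        child_tips, child_subsets = clade_subsets(child)
        tips |= child_tips
        subsets |= child_subsets
        if not isinstance(child, str):  # record each internal node below this one
            subsets.add(child_tips)
    return tips, subsets


tree1 = (("a", ("b", "c")), ("f", "g"))
tree2 = (("g", "f"), (("c", "b"), "a"))  # same topology, children reordered
tree3 = ((("a", "b"), "c"), ("f", "g"))  # different grouping of a, b, c

s1 = frozenset(clade_subsets(tree1)[1])
s2 = frozenset(clade_subsets(tree2)[1])
s3 = frozenset(clade_subsets(tree3)[1])

print(s1 == s2)                # True: child order does not matter
print(s1 == s3)                # False: the clades differ
print(frozenset("abc") in s1)  # True: a, b, c form a clade
print(frozenset("ab") in s1)   # False: a and b alone do not
```

Because the outer container and its members are all hashable, comparing two topologies is a single equality test, and a clade query is a single membership test.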

This method can be applied to both rooted and unrooted trees. However, it assumes that the direction of descent is from the current node to the tips below. That is, the root of the tree, even if not explicitly defined, should be at or above the current node. Keep this in mind when applying the method to an unrooted tree. If the assumption does not hold, consider using biparts() instead.
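To see why root placement matters, the following sketch writes the same unrooted five-taxon tree with two different root positions, using nested tuples rather than scikit-bio objects. The helpers `clade_subsets` and `splits` are hypothetical names introduced for illustration. Pairing each subset with its complement is only meant to show why an undirected representation is insensitive to root placement; it is not a description of how biparts() is implemented.

```python
def clade_subsets(node):
    """Return (tips, subsets) for a tree written as nested tuples of tip names."""
    if isinstance(node, str):           # a tip contributes itself, but no subset
        return frozenset([node]), set()
    tips, subsets = frozenset(), set()
    for child in node:
        child_tips, child_subsets = clade_subsets(child)
        tips |= child_tips
        subsets |= child_subsets
        if not isinstance(child, str):  # record each internal node below this one
            subsets.add(child_tips)
    return tips, subsets


def splits(tree):
    """Pair every subset with its complement, which discards direction."""
    all_tips, subsets = clade_subsets(tree)
    return {frozenset([s, all_tips - s]) for s in subsets}


# One unrooted topology, ((a,b),c,(d,e)), rooted on two different branches.
rooted_near_ab = (("a", "b"), ("c", ("d", "e")))
rooted_near_de = ((("a", "b"), "c"), ("d", "e"))

subsets_ab = clade_subsets(rooted_near_ab)[1]
subsets_de = clade_subsets(rooted_near_de)[1]

print(subsets_ab == subsets_de)                          # False
print(splits(rooted_near_ab) == splits(rooted_near_de))  # True
```

The first rooting yields the subset {c, d, e}, while the second yields {a, b, c} instead, so the two sets of subsets differ even though the underlying unrooted topology is identical.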

This method operates on the subtree below the current node.

Examples

>>> from skbio import TreeNode
>>> tree = TreeNode.read(["((a,(b,c)d)e,(f,g)h)i;"])
>>> print(tree.ascii_art())
                    /-a
          /e-------|
         |         |          /-b
         |          \d-------|
-i-------|                    \-c
         |
         |          /-f
          \h-------|
                    \-g
>>> subsets = tree.subsets()
>>> for s in sorted(subsets, key=sorted):
...     print(sorted(s))
['a', 'b', 'c']
['b', 'c']
['f', 'g']
>>> {'a', 'b', 'c'} in subsets
True
>>> {'a', 'b'} in subsets
False