skbio.tree.TreeNode.cache_attr
- TreeNode.cache_attr(func, cache_attrname, cache_type=list, register=True)
Cache attributes on nodes of the tree through a postorder traversal.
- Parameters:
- func : callable
Function to calculate the attribute of the current node. The result will be combined with the attributes of the previous nodes, if applicable.
- cache_attrname : str
Name of the attribute to be attached to each node.
- cache_type : {list, tuple, set, frozenset}, callable, or None
The type of the cache. Can be any of the four iterable types: list (default), tuple, set, or frozenset. In these cases, the attributes of the node’s children and of the node itself are combined automatically.
Or a custom function that takes two arguments, the list of attributes of the node’s children and the attribute calculated for the node itself by func, and returns the combined attribute of the node.
Or None, in which case the attributes of the children and of the node itself are not combined, unless this is explicitly implemented within func.
Changed in version 0.6.3: Tuple, custom function and None were added to the options.
- register : bool, optional
Whether to register the attribute name as a cache of the tree, such that the attributes will be deleted from all nodes when the tree is manipulated or the clear_caches method is explicitly invoked. Default is True.
Added in version 0.6.3.
- Raises:
- TypeError
If cache_type is invalid.
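The combination rules described for cache_type can be illustrated with a small pure-Python sketch. The helper name `combine` and its internals are hypothetical and written only to mirror the behavior described above; it is not the library's implementation, and the ordering (children first, then the node itself) is an assumption based on the example outputs on this page.

```python
from functools import reduce


def combine(cache_type, child_attrs, own_attr):
    """Hypothetical helper: merge children's cached values with the node's own."""
    if cache_type is None:
        # No combination: keep only what func returned for this node.
        return own_attr
    if cache_type in (list, tuple):
        # Sequences are concatenated, children first, then the node itself.
        parts = list(child_attrs) + [cache_type(own_attr)]
        return reduce(lambda a, b: a + b, parts)
    if cache_type in (set, frozenset):
        # Sets are merged by union.
        parts = list(child_attrs) + [cache_type(own_attr)]
        return reduce(lambda a, b: a | b, parts)
    if callable(cache_type):
        # Custom function: receives (children's attributes, own attribute).
        return cache_type(child_attrs, own_attr)
    raise TypeError("cache_type must be an iterable type, a callable, or None")


print(combine(list, [["a"], ["b"]], []))   # ['a', 'b']
print(combine(sum, [1, 1], 1))             # 3
print(combine(None, [["a"]], ["x"]))       # ['x']
```

The iterable types are checked before the generic callable case because list, tuple, set and frozenset are themselves callable.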
See also
Notes
This method provides an efficient interface to assign a custom attribute to every node of a tree through one postorder traversal. It is particularly useful if one needs to frequently look up attributes that would normally require one traversal of the tree per lookup. The assigned attributes may be automatically deleted when the tree is manipulated.
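The idea of filling a cache in a single postorder pass can be sketched without the library. The `Node` class and the functions `postorder` and `cache_tip_names` below are hypothetical stand-ins used only for illustration; they assume nothing about how TreeNode is implemented.

```python
class Node:
    """Hypothetical minimal tree node, standing in for TreeNode."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)


def postorder(node):
    """Yield every node of the subtree, children before parents."""
    for child in node.children:
        yield from postorder(child)
    yield node


def cache_tip_names(root):
    """Attach a tip_names list to every node in one pass."""
    for node in postorder(root):
        if not node.children:
            node.tip_names = [node.name]
        else:
            # Children were visited earlier, so their lists already exist.
            node.tip_names = [t for c in node.children for t in c.tip_names]


tree = Node("g", [Node("c", [Node("a"), Node("b")]),
                  Node("f", [Node("d"), Node("e")])])
cache_tip_names(tree)
print(tree.tip_names)              # ['a', 'b', 'd', 'e']
print(tree.children[0].tip_names)  # ['a', 'b']
```

After the pass, each lookup is a plain attribute access, whereas computing the tips of a clade on demand would walk that clade again on every call.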
Examples
This method facilitates evaluation for various useful node properties. Some representative examples are provided below.
>>> from skbio import TreeNode
>>> tree = TreeNode.read(["((a:1.2,b:1.6)c:0.3,(d:0.8,e:1.0)f:0.6)g;"])
>>> print(tree.ascii_art())
                    /-a
          /c-------|
         |          \-b
-g-------|
         |          /-d
          \f-------|
                    \-e
Cache a list of all descending tip names on each node. This facilitates the assignment of a taxon set to each clade in the tree. It resembles but is more efficient than calling subset() multiple times.
>>> f = lambda n: [n.name] if n.is_tip() else []
>>> tree.cache_attr(f, 'tip_names')
>>> for node in tree.traverse(include_self=True):
...     print(f"Node: {node.name}, tips: {node.tip_names}")
Node: g, tips: ['a', 'b', 'd', 'e']
Node: c, tips: ['a', 'b']
Node: a, tips: ['a']
Node: b, tips: ['b']
Node: f, tips: ['d', 'e']
Node: d, tips: ['d']
Node: e, tips: ['e']
Cache the number of nodes per clade. The function sum is passed as the cache type so that the counts are accumulated. This resembles but is more efficient than calling count() multiple times.
>>> f = lambda n: 1
>>> tree.cache_attr(f, 'node_count', sum)
>>> tree.node_count
7
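The built-in sum fits the two-argument contract of a custom cache type because its signature is sum(iterable, start). A minimal sketch of what happens at the root of the example tree, assuming the two child clades c and f have already been assigned a count of 3 each:

```python
# Counts assumed to be already cached on the root's two children (c and f).
child_counts = [3, 3]
# Value returned by func for the root itself.
own_count = 1

# sum(iterable, start) adds the start value to the total of the iterable,
# so the node's own value is folded in together with its children's.
total = sum(child_counts, own_count)
print(total)  # 7
```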
Cache the sum of branch lengths per clade. This resembles but is more efficient than calling total_length() multiple times.
>>> f = lambda n: n.length or 0.0
>>> tree.cache_attr(f, 'clade_size', sum)
>>> tree.clade_size
5.5
Cache the cumulative distances from all tips to the common ancestor of each clade. This is more efficient than calling depth() multiple times. One can further apply calculations like mean and standard deviation to the results.
>>> import numpy as np
>>> dist_f = lambda n: np.array(n.length or 0.0, ndmin=1)
>>> comb_f = lambda prev, curr: np.concatenate(prev) + curr if prev else curr
>>> tree.cache_attr(dist_f, 'accu_dists', comb_f)
>>> tree.accu_dists
array([ 1.5, 1.9, 1.4, 1.6])
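The follow-up statistics mentioned above can be sketched as follows. This uses the standard library rather than NumPy to stay dependency-free, and assumes the four distances shown in the output have been copied into a plain list. The population standard deviation is used here, which matches the default of NumPy's std.

```python
import statistics

# Tip-to-root distances as reported for the root of the example tree.
accu_dists = [1.5, 1.9, 1.4, 1.6]

mean_dist = statistics.fmean(accu_dists)
# Population standard deviation (divides by n, not n - 1).
std_dist = statistics.pstdev(accu_dists)

print(f"mean = {mean_dist:.3f}, std = {std_dist:.3f}")
# mean = 1.600, std = 0.187
```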