skbio.tree.TreeNode.from_taxdump
- classmethod TreeNode.from_taxdump(nodes, names=None)
Construct a tree from the NCBI taxonomy database.
- Parameters:
- nodes : pd.DataFrame
Taxon hierarchy
- names : pd.DataFrame or dict, optional
Taxon names
- Returns:
- TreeNode
The constructed tree
- Raises:
- ValueError
If there is no top-level node
- ValueError
If there is more than one top-level node
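The two error conditions can be illustrated with a small, library-independent sketch. It assumes that a top-level node is a taxon whose parent_tax_id equals its own tax_id (as taxon 1 does in the example data on this page). The helper find_top_level is hypothetical and is not part of scikit-bio; the actual check inside from_taxdump may differ.

```python
def find_top_level(parents):
    """Return the single taxon that is its own parent.

    `parents` maps tax_id -> parent_tax_id. This is a hypothetical helper
    that mirrors the documented error conditions; it is not skbio code.
    """
    # Assumption: a top-level node points to itself as its parent.
    roots = [tax_id for tax_id, parent in parents.items() if tax_id == parent]
    if not roots:
        raise ValueError("There is no top-level node.")
    if len(roots) > 1:
        raise ValueError("There is more than one top-level node.")
    return roots[0]


# Same hierarchy as the example data: taxon 1 is its own parent.
parents = {1: 1, 2: 1, 3: 1, 4: 2, 5: 2}
root = find_top_level(parents)
```

Running a check like this on the parent column before building the tree can make it easier to see which of the two conditions a malformed table triggers.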
Notes
nodes and names correspond to “nodes.dmp” and “names.dmp” of the NCBI taxonomy database. They should be read into data frames using skbio.io.read prior to this operation. Alternatively, names may be provided as a dictionary. If names is omitted, taxonomy IDs will be used as taxon names.
Examples
>>> import pandas as pd
>>> from skbio.tree import TreeNode
>>> nodes = pd.DataFrame([
...     [1, 1, 'no rank'],
...     [2, 1, 'domain'],
...     [3, 1, 'domain'],
...     [4, 2, 'phylum'],
...     [5, 2, 'phylum']], columns=[
...     'tax_id', 'parent_tax_id', 'rank']).set_index('tax_id')
>>> names = {1: 'root', 2: 'Bacteria', 3: 'Archaea',
...          4: 'Firmicutes', 5: 'Bacteroidetes'}
>>> tree = TreeNode.from_taxdump(nodes, names)
>>> print(tree.ascii_art())
                    /-Firmicutes
          /Bacteria|
-root----|          \-Bacteroidetes
         |
          \-Archaea
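Since names may also be given as a dictionary, the following is a minimal sketch of building one directly from lines in the “names.dmp” layout, using only the standard library. The helper names_to_dict is hypothetical and is not part of scikit-bio. Keeping only entries of class "scientific name" and using integer keys are assumptions made here for illustration; the keys should match whatever type the index of nodes uses.

```python
def names_to_dict(lines, name_class="scientific name"):
    """Map tax_id -> name for one name class (hypothetical helper).

    Each names.dmp record has four fields separated by "\t|\t" and
    terminated by "\t|": tax_id, name, unique name, name class.
    """
    names = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        # Drop the record terminator before splitting into fields.
        if line.endswith("\t|"):
            line = line[:-2]
        tax_id, name, _unique_name, cls = line.split("\t|\t")[:4]
        # Assumption: keep one name per taxon, chosen by name class.
        if cls == name_class:
            names[int(tax_id)] = name
    return names


# A few records in the names.dmp layout (made-up subset for illustration).
sample = [
    "1\t|\tall\t|\t\t|\tsynonym\t|\n",
    "1\t|\troot\t|\t\t|\tscientific name\t|\n",
    "2\t|\tBacteria\t|\tBacteria <bacteria>\t|\tscientific name\t|\n",
]
names = names_to_dict(sample)
```

The resulting dictionary has the same shape as the names argument in the example above and could be passed to TreeNode.from_taxdump together with a matching nodes data frame.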