skbio.table.Table
- class skbio.table.Table(data, observation_ids, sample_ids, observation_metadata=None, sample_metadata=None, table_id=None, type=None, create_date=None, generated_by=None, observation_group_metadata=None, sample_group_metadata=None, validate=True, observation_index=None, sample_index=None, **kwargs)
The (canonically pronounced ‘teh’) Table.
Give in to the power of the Table!
Creates an in-memory representation of a BIOM file. BIOM version 1.0 uses JSON to provide the overall structure of the format, while versions 2.0 and 2.1 are based on HDF5. For more information, see [1] and [2].
- Parameters:
- data : array_like
An (N, M) observation-by-sample matrix (N observations, M samples), represented as one of these types:
* A 1-dimensional array of values
* An n-dimensional array of values
* An empty list
* A list of numpy arrays
* A list of dicts
* A list of sparse matrices
* A dictionary of values
* A list of lists
* A sparse matrix of values
- observation_ids : array_like of str
An (N,) dataset of the observation IDs, where N is the total number of observation IDs
- sample_ids : array_like of str
An (M,) dataset of the sample IDs, where M is the total number of sample IDs
- observation_metadata : list of dicts, optional
Per-observation dictionary of annotations, where every key represents a metadata field that contains observation-specific information, such as taxonomy or KEGG pathway
- sample_metadata : array_like of dicts, optional
Per-sample dictionary of annotations, where every key represents a metadata field that contains sample-specific metadata information
- table_id : str, optional
A field that can be used to identify the table
- type : str, see Notes
The type of table represented
- create_date : str, optional
Date that this table was built
- generated_by : str, optional
Individual who built the table
- observation_group_metadata : list, optional
Group that contains observation-specific group metadata information (e.g., phylogenetic tree)
- sample_group_metadata : list, optional
Group that contains sample-specific group metadata information (e.g., relationships between samples)
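Read together, the shapes listed for `data`, `observation_ids`, and `sample_ids` imply one row per observation ID and one column per sample ID. The following is a minimal, standard-library sketch of that consistency check for a dense list-of-lists input. The helper `check_dimensions` is hypothetical and is not part of the library; it only illustrates the expected dimensions.

```python
def check_dimensions(data, observation_ids, sample_ids):
    """Check that a dense list-of-lists matrix matches the two ID lists.

    Hypothetical helper for illustration only; not the library's validation.
    """
    n_obs, n_samp = len(observation_ids), len(sample_ids)
    if len(data) != n_obs:
        raise ValueError(f"expected {n_obs} rows, got {len(data)}")
    for obs_id, row in zip(observation_ids, data):
        if len(row) != n_samp:
            raise ValueError(
                f"row {obs_id!r}: expected {n_samp} values, got {len(row)}"
            )
    return n_obs, n_samp


observation_ids = ["O1", "O2", "O3"]  # N = 3
sample_ids = ["S1", "S2"]             # M = 2
data = [
    [0, 5],  # counts of O1 in S1 and S2
    [3, 0],  # counts of O2 in S1 and S2
    [1, 1],  # counts of O3 in S1 and S2
]

shape = check_dimensions(data, observation_ids, sample_ids)
print(shape)
```

Passing the transposed matrix (two rows of three values) to the same helper raises `ValueError`, which is the kind of orientation mistake the shapes above are meant to rule out.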
- Attributes:
- shape
The shape of the underlying contingency matrix
- dtype
The type of the objects in the underlying contingency matrix
- nnz
Number of non-zero elements of the underlying contingency matrix
- matrix_data
The sparse matrix object
- type
- table_id
- create_date
- generated_by
- format_version
- Raises:
- TableException
When an invalid table type is provided.
Notes
Allowed table types are None, “OTU table”, “Pathway table”, “Function table”, “Ortholog table”, “Gene table”, “Metabolite table”, “Taxon table”
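As a rough illustration of the rule above, the check can be sketched with the standard library alone. `TableTypeError` and `check_table_type` are hypothetical names standing in for the library's `TableException` and its internal validation, and exact, case-sensitive matching is an assumption; the library itself may be more lenient.

```python
# Allowed values copied from the Notes; None means "no type set".
ALLOWED_TABLE_TYPES = frozenset({
    None,
    "OTU table",
    "Pathway table",
    "Function table",
    "Ortholog table",
    "Gene table",
    "Metabolite table",
    "Taxon table",
})


class TableTypeError(ValueError):
    """Hypothetical stand-in for the library's TableException."""


def check_table_type(table_type):
    """Return table_type unchanged if it is allowed, otherwise raise."""
    # Assumption: exact, case-sensitive comparison against the list above.
    if table_type not in ALLOWED_TABLE_TYPES:
        raise TableTypeError(f"Unknown table type: {table_type!r}")
    return table_type


check_table_type("OTU table")  # accepted
check_table_type(None)         # accepted: the type is optional

try:
    check_table_type("Protein table")
except TableTypeError as err:
    print(err)
```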
References
[2] D. McDonald, et al. “The Biological Observation Matrix (BIOM) format or: how I learned to stop worrying and love the ome-ome.” GigaScience 2012, 1:7.
Attributes
dtype: The type of the objects in the underlying contingency matrix
matrix_data: The sparse matrix object
nnz: Number of non-zero elements of the underlying contingency matrix
shape: The shape of the underlying contingency matrix
Methods
Take a dict of group metadata and add it to an axis
Take a dict of metadata and add it to an axis.
Align self to other over a requested axis
Aligns dataframe against biom table, only keeping common ids.
Aligns biom table against tree, only keeping common ids.
Collapse partitions in a table by metadata or by IDs
Concatenate tables if axis is disjoint
Returns a copy of the table
Returns data associated with an id
Remove metadata from an axis
Return self as a string in a delimited form
For use in testing, describe how the tables are not equal
Returns whether id exists in axis
Filter a table based on a function or iterable.
Parse an adjacency format into BIOM
Parse an HDF5 formatted BIOM table
Parse a biom otu table type
Parse a tab separated (observation x sample) formatted BIOM table
Returns the fraction of nonzero elements in the table.
Return value in the matrix corresponding to (obs_id, samp_id)
Return the group metadata of the given axis
Get the first n rows and m columns from self
Return the ids along the given axis
Return the index of the identified sample/observation.
Check whether the table is empty
Yields (value, id, metadata)
Yields axis values
Pairwise iteration over self
Return the length of an axis
Get the maximum nonzero value over an axis
Merge two tables together
Return the metadata of the identified sample/observation.
Convert axis metadata to a Pandas DataFrame
Get the minimum nonzero value over an axis
Yields locations of nonzero elements within the data matrix
Get nonzero summaries about an axis
Normalize in place sample values by an observation, or vice versa.
Convert the table to presence/absence data
Yields partitions
Convert values to rank abundances from smallest to largest
Create a new Table instance from a file.
Reduce over axis using function f
Remove empty samples or observations from the table
Return a table sorted along axis
Return a new table with axis in order
Randomly subsample without replacement.
Returns the sum by axis
Convert Table to AnnData format
Convert matrix data to a Pandas SparseDataFrame or DataFrame
Store CSC and CSR in place
Returns a JSON string representing the table in BIOM format.
Return self as a string in tab delimited form
Iterate over axis, applying a function f to each vector.
Transpose the contingency table
Update the ids along the given axis.
Write an instance of Table to a file.
Special methods
Equality is determined by the data matrix, metadata, and IDs
Handles row or column slices
See biom.table.Table.iter
Return self!=value.
Stringify self
Special methods (inherited)
__ge__: Return self>=value.
__getstate__: Helper for pickle.
__gt__: Return self>value.
__le__: Return self<=value.
__lt__: Return self<value.
Details
- default_write_format = 'biom'
- dtype
The type of the objects in the underlying contingency matrix
- matrix_data
The sparse matrix object
- nnz
Number of non-zero elements of the underlying contingency matrix
- shape
The shape of the underlying contingency matrix
- __getitem__(args)
Handles row or column slices
Slicing over a single axis is supported, but slicing over both axes at the same time is not. Partial slices, such as foo[0, 5:10], are not supported; full slices, such as foo[0, :], are.
- Parameters:
- args : tuple or slice
The specific element (by index position) to return, or an entire row or column of the data.
- Returns:
- float or spmatrix
A float is returned if a specific element is specified; otherwise, a spmatrix object representing a vector of sparse data is returned.
- Raises:
- IndexError
If the matrix is empty
If the arguments do not appear to be a tuple
If a slice on row and column is specified
If a partial slice is specified
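The indexing rules described for `__getitem__` can be sketched as a small classifier. This is a standard-library illustration only: `classify_index` and `Probe` are hypothetical names, the order of the checks is an assumption, and the empty-matrix case is left out because the sketch holds no data.

```python
FULL_SLICE = slice(None, None, None)  # what a bare ":" evaluates to


def classify_index(args):
    """Classify a 2-D index as 'element', 'row', or 'column'.

    Hypothetical helper mirroring the documented rules; not library code.
    """
    if not isinstance(args, tuple) or len(args) != 2:
        raise IndexError("expected a (row, column) tuple")
    row, col = args
    row_is_slice = isinstance(row, slice)
    col_is_slice = isinstance(col, slice)
    if row_is_slice and col_is_slice:
        raise IndexError("cannot slice rows and columns at the same time")
    for part in (row, col):
        if isinstance(part, slice) and part != FULL_SLICE:
            raise IndexError("partial slices are not supported")
    if col_is_slice:
        return "row"      # foo[0, :] selects an entire row
    if row_is_slice:
        return "column"   # foo[:, 0] selects an entire column
    return "element"      # foo[0, 1] selects a single value


class Probe:
    """Tiny object that exposes the classifier through [] syntax."""

    def __getitem__(self, args):
        return classify_index(args)


foo = Probe()
print(foo[0, :])   # row
print(foo[:, 0])   # column
print(foo[0, 1])   # element

try:
    foo[0, 5:10]
except IndexError as err:
    print(err)     # partial slices are not supported
```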
Notes
Switching between slicing rows and columns is inefficient. Slicing of rows requires a CSR representation, while slicing of columns requires a CSC representation, and transforms are performed on the data if the data are not in the required representation. These transforms can be expensive if done frequently.
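To see why the layout matters, here is a minimal pure-Python sketch of a CSR (compressed sparse row) structure. The helper names are hypothetical and this is not how the library stores or converts its data. The sketch only shows that a row is a single contiguous slice in CSR, whereas a column has to be gathered from every row, which is the work that a conversion to CSC does up front.

```python
def dense_to_csr(rows):
    """Compress a dense list-of-lists matrix row by row (CSR layout)."""
    data, indices, indptr = [], [], [0]
    for row in rows:
        for j, value in enumerate(row):
            if value != 0:
                data.append(value)   # stored non-zero value
                indices.append(j)    # its column index
        indptr.append(len(data))     # where the next row starts
    return data, indices, indptr


def csr_row(csr, i):
    """Read row i from CSR: one contiguous slice, other rows are untouched."""
    data, indices, indptr = csr
    start, stop = indptr[i], indptr[i + 1]
    return dict(zip(indices[start:stop], data[start:stop]))


def csr_column(csr, j):
    """Read column j from CSR: every stored entry has to be inspected."""
    data, indices, indptr = csr
    column = {}
    for i in range(len(indptr) - 1):
        for k in range(indptr[i], indptr[i + 1]):
            if indices[k] == j:
                column[i] = data[k]
    return column


matrix = [
    [0, 5, 0],
    [3, 0, 0],
    [0, 1, 2],
]
csr = dense_to_csr(matrix)

print(csr_row(csr, 2))     # {1: 1, 2: 2}
print(csr_column(csr, 1))  # {0: 5, 2: 1}
```

In practice this suggests grouping row accesses together and column accesses together, rather than alternating between the two.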