skbio.table.Table

class skbio.table.Table(data, observation_ids, sample_ids, observation_metadata=None, sample_metadata=None, table_id=None, type=None, create_date=None, generated_by=None, observation_group_metadata=None, sample_group_metadata=None, validate=True, observation_index=None, sample_index=None, **kwargs)

The (canonically pronounced ‘teh’) Table.

Give in to the power of the Table!

Creates an in-memory representation of a BIOM file. BIOM version 1.0 uses JSON to provide the overall structure of the format, while versions 2.0 and 2.1 are based on HDF5. For more information, see [1] and [2].

Attributes:
shape

The shape of the underlying contingency matrix

dtype

The type of the objects in the underlying contingency matrix

nnz

Number of non-zero elements of the underlying contingency matrix

matrix_data

The sparse matrix object

type
table_id
create_date
generated_by
format_version
Raises:
TableException

When an invalid table type is provided.

Notes

Allowed table types are None, “OTU table”, “Pathway table”, “Function table”, “Ortholog table”, “Gene table”, “Metabolite table”, and “Taxon table”.

References

[2] D. McDonald, et al. “The Biological Observation Matrix (BIOM) format or: how I learned to stop worrying and love the ome-ome.” GigaScience 2012, 1:7.

Attributes

default_write_format

dtype

The type of the objects in the underlying contingency matrix

matrix_data

The sparse matrix object

nnz

Number of non-zero elements of the underlying contingency matrix

shape

The shape of the underlying contingency matrix

Methods

add_group_metadata(group_md[, axis])

Take a dict of group metadata and add it to an axis

add_metadata(md[, axis])

Take a dict of metadata and add it to an axis.

align_to(other[, axis])

Align self to other over a requested axis

align_to_dataframe(metadata[, axis])

Aligns dataframe against biom table, only keeping common ids.

align_tree(tree[, axis])

Aligns biom table against tree, only keeping common ids.

collapse(f[, collapse_f, norm, ...])

Collapse partitions in a table by metadata or by IDs

concat(others[, axis])

Concatenate tables if axis is disjoint

copy()

Returns a copy of the table

data(id[, axis, dense])

Returns data associated with an id

del_metadata([keys, axis])

Remove metadata from an axis

delimited_self([delim, header_key, ...])

Return self as a string in a delimited form

descriptive_equality(other)

For use in testing, describe how the tables are not equal

exists(id[, axis])

Returns whether id exists in axis

filter(ids_to_keep[, axis, invert, inplace])

Filter a table based on a function or iterable.

from_adjacency(lines)

Parse an adjacency format into BIOM

from_hdf5(h5grp[, ids, axis, parse_fs, ...])

Parse an HDF5 formatted BIOM table

from_json(json_table[, data_pump, ...])

Parse a JSON-formatted BIOM table

from_tsv(lines, obs_mapping, sample_mapping, ...)

Parse a tab separated (observation x sample) formatted BIOM table

get_table_density()

Returns the fraction of nonzero elements in the table.

get_value_by_ids(obs_id, samp_id)

Return value in the matrix corresponding to (obs_id, samp_id)

group_metadata([axis])

Return the group metadata of the given axis

head([n, m])

Get the first n rows and m columns from self

ids([axis])

Return the ids along the given axis

index(id, axis)

Return the index of the identified sample/observation.

is_empty()

Check whether the table is empty

iter([dense, axis])

Yields (value, id, metadata)

iter_data([dense, axis])

Yields axis values

iter_pairwise([dense, axis, tri, diag])

Pairwise iteration over self

length([axis])

Return the length of an axis

max([axis])

Get the maximum nonzero value over an axis

merge(other[, sample, observation, ...])

Merge two tables together

metadata([id, axis])

Return the metadata of the identified sample/observation.

metadata_to_dataframe(axis)

Convert axis metadata to a Pandas DataFrame

min([axis])

Get the minimum nonzero value over an axis

nonzero()

Yields locations of nonzero elements within the data matrix

nonzero_counts(axis[, binary])

Get nonzero summaries about an axis

norm([axis, inplace])

Normalize in place sample values by an observation, or vice versa.

pa([inplace])

Convert the table to presence/absence data

partition(f[, axis, remove_empty, ignore_none])

Yields partitions

rankdata([axis, inplace, method])

Convert values to rank abundances from smallest to largest

read(file[, format])

Create a new Table instance from a file.

reduce(f, axis)

Reduce over axis using function f

remove_empty([axis, inplace])

Remove empty samples or observations from the table

sort([sort_f, axis])

Return a table sorted along axis

sort_order(order[, axis])

Return a new table with axis in order

subsample(n[, axis, by_id, ...])

Randomly subsample without replacement.

sum([axis])

Returns the sum by axis

to_anndata([dense, dtype, transpose])

Convert Table to AnnData format

to_dataframe([dense])

Convert matrix data to a Pandas SparseDataFrame or DataFrame

to_hdf5(h5grp, generated_by[, compress, ...])

Store CSC and CSR in place

to_json(generated_by[, direct_io, creation_date])

Returns a JSON string representing the table in BIOM format.

to_tsv([header_key, header_value, ...])

Return self as a string in tab delimited form

transform(f[, axis, inplace])

Iterate over axis, applying a function f to each vector.

transpose()

Transpose the contingency table

update_ids(id_map[, axis, strict, inplace])

Update the ids along the given axis.

write(file[, format])

Write an instance of Table to a file.

Special methods

__eq__(other)

Equality is determined by the data matrix, metadata, and IDs

__getitem__(args)

Handles row or column slices

__iter__()

See biom.table.Table.iter

__ne__(other)

Return self!=value.

__str__()

Stringify self

Special methods (inherited)

__ge__(value, /)

Return self>=value.

__getstate__(/)

Helper for pickle.

__gt__(value, /)

Return self>value.

__le__(value, /)

Return self<=value.

__lt__(value, /)

Return self<value.

Details

default_write_format = 'biom'
dtype

The type of the objects in the underlying contingency matrix

matrix_data

The sparse matrix object

nnz

Number of non-zero elements of the underlying contingency matrix

shape

The shape of the underlying contingency matrix

__eq__(other)

Equality is determined by the data matrix, metadata, and IDs

__getitem__(args)

Handles row or column slices

Slicing over an individual axis is supported, but slicing over both axes at the same time is not. Partial slices, such as foo[0, 5:10], are not supported; full slices, such as foo[0, :], are.

Parameters:
args : tuple or slice

The specific element (by index position) to return or an entire row or column of the data.

Returns:
float or spmatrix

A float is returned if a specific element is specified; otherwise, a spmatrix object representing a vector of sparse data is returned.

Raises:
IndexError
  • If the matrix is empty

  • If the arguments do not appear to be a tuple

  • If a slice on row and column is specified

  • If a partial slice is specified
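The indexing rules documented for `__getitem__` can be sketched in plain Python. The helper below, `classify_index`, is hypothetical and is not part of the library; it only mirrors the documented behaviour (single element, full row, full column, or `IndexError`) so the accepted and rejected forms are easy to compare.

```python
def classify_index(args, shape):
    """Hypothetical helper mirroring the documented __getitem__ rules."""
    n_rows, n_cols = shape
    if n_rows == 0 or n_cols == 0:
        raise IndexError("Cannot index an empty matrix")
    if not isinstance(args, tuple) or len(args) != 2:
        raise IndexError("Expected a (row, column) tuple")

    row, col = args
    full = slice(None, None, None)
    row_is_slice = isinstance(row, slice)
    col_is_slice = isinstance(col, slice)

    if row_is_slice and col_is_slice:
        raise IndexError("Cannot slice both axes at once")
    if (row_is_slice and row != full) or (col_is_slice and col != full):
        raise IndexError("Partial slices are not supported")

    if col_is_slice:
        return "row"      # e.g. foo[0, :] selects a whole row
    if row_is_slice:
        return "column"   # e.g. foo[:, 0] selects a whole column
    return "element"      # e.g. foo[0, 1] selects a single value


shape = (2, 3)
print(classify_index((0, 1), shape))            # element
print(classify_index((0, slice(None)), shape))  # row
print(classify_index((slice(None), 1), shape))  # column
```

Under these assumed rules, `(0, slice(5, 10))` and `(slice(None), slice(None))` both raise `IndexError`, matching the partial-slice and two-axis cases listed above.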

Notes

Switching between slicing rows and columns is inefficient. Slicing of rows requires a CSR representation, while slicing of columns requires a CSC representation, and transforms are performed on the data if the data are not in the required representation. These transforms can be expensive if done frequently.
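To make the cost described in this note concrete, here is a minimal pure-Python sketch of a CSR layout. It is not the library's implementation, and the function names are invented for the example; it only shows why a row is cheap to read from CSR while a column is not, which is the reason a conversion to CSC is needed for column access.

```python
def to_csr(dense):
    """Build (data, indices, indptr) CSR arrays from a list-of-lists matrix."""
    data, indices, indptr = [], [], [0]
    for row in dense:
        for j, value in enumerate(row):
            if value != 0:
                data.append(value)
                indices.append(j)
        indptr.append(len(data))
    return data, indices, indptr


def csr_row(csr, i):
    """Row i is one contiguous slice of the stored arrays: cheap."""
    data, indices, indptr = csr
    start, stop = indptr[i], indptr[i + 1]
    return dict(zip(indices[start:stop], data[start:stop]))


def csr_col(csr, j):
    """Column j requires scanning the stored entries of every row: expensive."""
    data, indices, indptr = csr
    col = {}
    for i in range(len(indptr) - 1):
        for k in range(indptr[i], indptr[i + 1]):
            if indices[k] == j:
                col[i] = data[k]
    return col


dense = [[0, 1, 2],
         [3, 0, 5]]
csr = to_csr(dense)
print(csr_row(csr, 1))  # {0: 3, 2: 5}
print(csr_col(csr, 2))  # {0: 2, 1: 5}
```

CSC is the same layout with the roles of rows and columns swapped, so alternating row and column access forces repeated conversions. Grouping all row slices together, then all column slices, avoids most of that work.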

__iter__()

See biom.table.Table.iter

__ne__(other)

Return self!=value.

__str__()

Stringify self

Default str output for a Table is just row/col ids and data values