BIOM-Format (skbio.io.format.biom)#

The BIOM-Format (format v2.1.0) is an HDF5-based format for representing sample/feature counts or relative abundances. It is designed specifically for sparse data. Internally, it stores the data in both compressed sparse row (CSR) and compressed sparse column (CSC) representations. It also supports sample and feature metadata.
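To make the two sparse layouts concrete, here is a minimal pure-Python sketch of how one small matrix looks in CSR and CSC form. It only illustrates the general data/indices/indptr scheme; it is not how scikit-bio or biom build their tables, and the helper `to_compressed` and the `counts` matrix are hypothetical names invented for this example.

```python
def to_compressed(dense, by_rows=True):
    """Return (data, indices, indptr) for a dense list-of-lists matrix.

    by_rows=True yields a CSR layout; by_rows=False yields a CSC layout.
    Hypothetical helper for illustration only.
    """
    # For CSC, walk the transposed matrix so that each "line" is a column.
    lines = dense if by_rows else [list(col) for col in zip(*dense)]
    data, indices, indptr = [], [], [0]
    for line in lines:
        for j, value in enumerate(line):
            if value != 0:
                data.append(value)   # the nonzero value itself
                indices.append(j)    # its position within the line
        indptr.append(len(data))     # where the next line starts in `data`
    return data, indices, indptr


# Hypothetical 2 x 3 table with 5 nonzero entries.
counts = [
    [0, 1, 2],
    [3, 4, 5],
]

csr = to_compressed(counts, by_rows=True)
csc = to_compressed(counts, by_rows=False)
print("CSR:", csr)
print("CSC:", csc)
```

Keeping both layouts means that slicing by row and slicing by column can each read one contiguous run of `data`, at the cost of storing the nonzero values twice.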

Note

Internally, BIOM describes features as “observations,” which differs from scikit-bio’s standard terminology. Throughout scikit-bio documentation and APIs, these are consistently referred to as “features.” For more details about terminology differences across formats, see the terminology section of the table-like documentation.

Format Support#

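Has Sniffer: Yes

+--------+--------+-------------------+
| Reader | Writer | Object Class      |
+========+========+===================+
| Yes    | Yes    | skbio.table.Table |
+--------+--------+-------------------+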

Format Specification#

The official format specification for BIOM-Format can be found at [1].

Examples#

Here we write an existing BIOM table and then read it back. Note that the Table from biom implicitly gets the .write method from the IO registry. In regular use, a file path can be passed in place of the BytesIO object.

>>> import io, skbio
>>> f = io.BytesIO()
>>> skbio.table.example_table.write(f)
<_io.BytesIO object at ...>
>>> roundtrip = skbio.read(f, into=skbio.Table)
>>> roundtrip
2 x 3 <class 'biom.table.Table'> with 5 nonzero entries (83% dense)

References#