BIOM-Format (skbio.io.format.biom)#
BIOM-Format (format v2.1.0) is an HDF5-based format for representing sample/feature counts or relative abundances. It is designed specifically for sparse data. Internally, it stores the data in both compressed sparse row (CSR) and compressed sparse column (CSC) representations. It also supports sample and feature metadata.
Note
Internally, BIOM describes features as “observations,” which differs from scikit-bio’s standard terminology. Throughout scikit-bio documentation and APIs, these are consistently referred to as “features.” For more details about terminology differences across formats, see the terminology section of the table-like documentation.
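To make the compressed sparse row and column layouts mentioned in the introduction more concrete, here is a minimal pure-Python sketch. It is an illustration of the general technique only and does not reproduce how BIOM lays out its HDF5 datasets. The toy table and the `compress` helper are made up for this example.

```python
# Toy dense count table (rows = features, columns = samples).
# The values are invented for illustration.
dense = [
    [0, 5, 0],
    [2, 0, 3],
]


def compress(rows):
    """Hypothetical helper: return (data, indices, indptr) for a list of rows.

    data    -- the nonzero values, row by row
    indices -- the position of each nonzero value within its row
    indptr  -- offsets into data/indices where each row starts and ends
    """
    data, indices, indptr = [], [], [0]
    for row in rows:
        for j, value in enumerate(row):
            if value != 0:
                data.append(value)
                indices.append(j)
        indptr.append(len(data))
    return data, indices, indptr


# Compressed sparse row: compress the table row by row.
csr = compress(dense)

# Compressed sparse column: compress the transpose, so columns act as rows.
csc = compress([list(col) for col in zip(*dense)])
```

Keeping both layouts trades extra storage for fast slicing along either axis: the row-oriented arrays make it cheap to pull out one row, and the column-oriented arrays do the same for one column.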
Format Support#
Has Sniffer: Yes
Reader | Writer | Object Class |
---|---|---|
Yes | Yes | skbio.Table |
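Because the format is HDF5-based, a sniffer for it could plausibly begin by checking for the HDF5 file signature. The sketch below shows only that first step, and it is not scikit-bio's actual sniffer implementation. The `looks_like_hdf5` helper is hypothetical, and it only inspects offset 0, so it does not handle HDF5 files that start with a user block.

```python
import io

# The 8-byte signature defined by the HDF5 file format specification.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def looks_like_hdf5(fh):
    """Hypothetical helper: True if the binary stream starts with the HDF5 signature."""
    start = fh.tell()
    header = fh.read(len(HDF5_SIGNATURE))
    fh.seek(start)  # restore the position so the caller can still read the stream
    return header == HDF5_SIGNATURE


# Usage on in-memory buffers standing in for real files.
hdf5_like = io.BytesIO(HDF5_SIGNATURE + b"remaining bytes")
json_like = io.BytesIO(b'{"format": "not hdf5"}')
is_hdf5 = looks_like_hdf5(hdf5_like)
is_json_hdf5 = looks_like_hdf5(json_like)
```

A complete sniffer would also need to confirm that the HDF5 file is a BIOM table, since many unrelated formats share the same container.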
Format Specification#
The official format specification for BIOM-Format can be found at [1].
Examples#
Here we write an existing BIOM table to an in-memory buffer and read it back. Note that the Table from biom implicitly gets the .write method from the IO registry. In regular use, the BytesIO object could instead be a file path.
>>> import io, skbio
>>> f = io.BytesIO()
>>> skbio.table.example_table.write(f)
<_io.BytesIO object at ...>
>>> roundtrip = skbio.read(f, into=skbio.Table)
>>> roundtrip
2 x 3 <class 'biom.table.Table'> with 5 nonzero entries (83% dense)