skbio.stats.composition.sbp_basis

skbio.stats.composition.sbp_basis(sbp)

Build an orthogonal basis from a sequential binary partition (SBP).

An SBP is a hierarchical collection of binary divisions of compositional parts [1]. Each child group is divided again until every group contains a single part. The SBP can be encoded in a \((D - 1) \times D\) matrix in which each row tags the parts of its two groups with +1 and -1 and marks excluded parts with 0. The i-th balance is computed as follows:

\[b_i = \sqrt{ \frac{r_i s_i}{r_i+s_i} } \ln \left( \frac{g(x_{r_i})}{g(x_{s_i})} \right)\]

where \(b_i\) is the i-th balance corresponding to the i-th row in the SBP, \(r_i\) and \(s_i\) are the numbers of +1 and -1 labels, respectively, in the i-th row of the SBP, and \(g(x) = (\prod\limits_{i=1}^{D} x_i)^{1/D}\) is the geometric mean of \(x\). In the balance, \(g\) is applied separately to the parts labelled +1, \(x_{r_i}\), and to the parts labelled -1, \(x_{s_i}\).
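The formula can be sketched directly in NumPy for a single SBP row. This is only an illustration of the equation, not the scikit-bio implementation; the helper names `geometric_mean` and `balance` and the sample composition `x` are assumptions introduced here.

```python
import numpy as np

def geometric_mean(values):
    """Geometric mean of strictly positive values."""
    return float(np.exp(np.mean(np.log(values))))

def balance(x, sbp_row):
    """Balance of composition x for one SBP row (hypothetical helper).

    Parts tagged +1 form the numerator group, parts tagged -1 the
    denominator group, and parts tagged 0 are ignored.
    """
    x = np.asarray(x, dtype=float)
    sbp_row = np.asarray(sbp_row)
    num = x[sbp_row == 1]    # parts labelled +1
    den = x[sbp_row == -1]   # parts labelled -1
    r, s = num.size, den.size
    scale = np.sqrt(r * s / (r + s))
    return scale * np.log(geometric_mean(num) / geometric_mean(den))

# Illustrative composition with five parts (made-up values).
x = np.array([0.1, 0.2, 0.3, 0.25, 0.15])
b1 = balance(x, [1, 1, -1, -1, -1])
print(b1)
```

A composition whose parts are all equal gives a balance of zero for any row, since both geometric means coincide.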

Parameters:
sbp : array_like of shape (n_partitions, n_features)

A contrast matrix, also known as a sequential binary partition, where every row represents a partition between two groups of features. In a given row, a part labelled +1 places that feature in the numerator of the partition, a part labelled -1 places it in the denominator, and a part labelled 0 excludes it from the partition.

Returns:
ndarray of shape (n_partitions, n_features)

An orthonormal basis in the Aitchison simplex.

Notes

The sbp_basis method was derived from the gsi.buildilrBase() function implemented in the R package “compositions” [2].

References

[1]

Parent, S.É., Parent, L.E., Egozcue, J.J., Rozane, D.E., Hernandes, A., Lapointe, L., Hébert-Gentile, V., Naess, K., Marchand, S., Lafond, J., Mattos, D., Barlow, P., Natale, W., 2013. The plant ionome revisited by the nutrient balance concept. Front. Plant Sci. 4, 39.

[2]

van den Boogaart, K. Gerald, Tolosana-Delgado, Raimon and Bren, Matevz, 2014. compositions: Compositional Data Analysis. R package version 1.40-1. https://CRAN.R-project.org/package=compositions.

Examples

>>> import numpy as np
>>> from skbio.stats.composition import sbp_basis
>>> sbp = np.array([[1, 1,-1,-1,-1],
...                 [1,-1, 0, 0, 0],
...                 [0, 0, 1,-1,-1],
...                 [0, 0, 0, 1,-1]])
...
>>> sbp_basis(sbp)
array([[ 0.54772256,  0.54772256, -0.36514837, -0.36514837, -0.36514837],
       [ 0.70710678, -0.70710678,  0.        ,  0.        ,  0.        ],
       [ 0.        ,  0.        ,  0.81649658, -0.40824829, -0.40824829],
       [ 0.        ,  0.        ,  0.        ,  0.70710678, -0.70710678]])
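The printed values can be reproduced from the balance formula: within a row, a +1 entry becomes \(\sqrt{s_i / (r_i (r_i + s_i))}\) and a -1 entry becomes \(-\sqrt{r_i / (s_i (r_i + s_i))}\). The sketch below rebuilds the matrix with plain NumPy and checks that its rows are orthonormal and sum to zero. The helper `basis_from_sbp` is hypothetical and is not claimed to be how scikit-bio computes the result; it assumes every row contains at least one +1 and one -1.

```python
import numpy as np

def basis_from_sbp(sbp):
    """Rebuild the basis rows from an SBP matrix (hypothetical helper)."""
    sbp = np.asarray(sbp, dtype=float)
    # Number of +1 and -1 labels per row, kept as column vectors
    # so they broadcast across the features of each row.
    r = (sbp == 1).sum(axis=1, keepdims=True)
    s = (sbp == -1).sum(axis=1, keepdims=True)
    pos = np.sqrt(s / (r * (r + s)))    # value for parts labelled +1
    neg = -np.sqrt(r / (s * (r + s)))   # value for parts labelled -1
    return np.where(sbp == 1, pos, np.where(sbp == -1, neg, 0.0))

sbp = np.array([[1, 1, -1, -1, -1],
                [1, -1, 0, 0, 0],
                [0, 0, 1, -1, -1],
                [0, 0, 0, 1, -1]])
basis = basis_from_sbp(sbp)

# Rows are mutually orthogonal with unit norm, and each row sums to zero.
is_orthonormal = np.allclose(basis @ basis.T, np.eye(basis.shape[0]))
rows_sum_to_zero = np.allclose(basis.sum(axis=1), 0.0)
print(is_orthonormal, rows_sum_to_zero)

# Projecting the log of a composition onto the rows yields its balances.
x = np.array([0.1, 0.2, 0.3, 0.25, 0.15])  # made-up composition
balances = basis @ np.log(x)
print(balances)
```

Both checks evaluate to `True` for this SBP, and each entry of `balances` equals the corresponding \(b_i\) from the balance formula because the row coefficients are that formula written as weights on \(\ln x\).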