skbio.stats.composition.sbp_basis
- skbio.stats.composition.sbp_basis(sbp)
Build an orthogonal basis from a sequential binary partition (SBP).
An SBP is a hierarchical collection of binary divisions of compositional parts ([1]). The child groups are divided again until all groups contain a single part. The SBP can be encoded in a \((D - 1) \times D\) matrix where, in each row, the two groups of parts are tagged -1 and +1, and excluded parts are tagged 0. The i-th balance is computed as follows:
\[b_i = \sqrt{ \frac{r_i s_i}{r_i+s_i} } \ln \left( \frac{g(x_{r_i})}{g(x_{s_i})} \right)\]
where \(b_i\) is the i-th balance corresponding to the i-th row in the SBP, \(r_i\) and \(s_i\) are the numbers of +1 and -1 labels, respectively, in the i-th row of the SBP, \(x_{r_i}\) and \(x_{s_i}\) are the parts carrying those labels, and \(g(x) = (\prod\limits_{i=1}^{D} x_i)^{1/D}\) is the geometric mean of \(x\).
- Parameters:
- sbp : array_like of shape (n_partitions, n_features)
A contrast matrix, also known as a sequential binary partition, where every row represents a partition between two groups of features. A part labelled +1 would correspond to that feature being in the numerator of the given row partition, a part labelled -1 would correspond to features being in the denominator of that given row partition, and 0 would correspond to features excluded in the row partition.
- Returns:
- ndarray of shape (n_partitions, n_features)
An orthonormal basis in the Aitchison simplex.
Notes
The sbp_basis method was derived from the gsi.buildilrBase() function implemented in the R package “compositions” [2].
References
[1] Parent, S.É., Parent, L.E., Egozcue, J.J., Rozane, D.E., Hernandes, A., Lapointe, L., Hébert-Gentile, V., Naess, K., Marchand, S., Lafond, J., Mattos, D., Barlow, P., Natale, W., 2013. The plant ionome revisited by the nutrient balance concept. Front. Plant Sci. 4, 39.
[2] van den Boogaart, K. Gerald, Tolosana-Delgado, Raimon and Bren, Matevz, 2014. compositions: Compositional Data Analysis. R package version 1.40-1. https://CRAN.R-project.org/package=compositions.
Examples
>>> import numpy as np
>>> from skbio.stats.composition import sbp_basis
>>> sbp = np.array([[1, 1,-1,-1,-1],
...                 [1,-1, 0, 0, 0],
...                 [0, 0, 1,-1,-1],
...                 [0, 0, 0, 1,-1]])
>>> sbp_basis(sbp)
array([[ 0.54772256,  0.54772256, -0.36514837, -0.36514837, -0.36514837],
       [ 0.70710678, -0.70710678,  0.        ,  0.        ,  0.        ],
       [ 0.        ,  0.        ,  0.81649658, -0.40824829, -0.40824829],
       [ 0.        ,  0.        ,  0.        ,  0.70710678, -0.70710678]])
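The connection between the balance formula and the basis rows can be checked with plain NumPy. The sketch below is illustrative only and is not scikit-bio's implementation: the helper names `sbp_to_clr_basis` and `balances_from_formula` and the composition `x` are hypothetical, and the code assumes every row of the SBP contains at least one +1 and one -1. It writes each basis row in log-ratio coordinates, with weight \(\sqrt{s_i / (r_i (r_i + s_i))}\) on +1 parts and \(-\sqrt{r_i / (s_i (r_i + s_i))}\) on -1 parts, so that the dot product of a row with \(\ln x\) gives the balance \(b_i\).

```python
import numpy as np


def sbp_to_clr_basis(sbp):
    """Hypothetical helper: turn an SBP sign matrix into basis rows.

    Assumes each row has at least one +1 and at least one -1.
    """
    sbp = np.asarray(sbp)
    r = (sbp > 0).sum(axis=1, keepdims=True)  # number of +1 labels per row
    s = (sbp < 0).sum(axis=1, keepdims=True)  # number of -1 labels per row
    pos = np.sqrt(s / (r * (r + s)))          # weight for +1 parts
    neg = -np.sqrt(r / (s * (r + s)))         # weight for -1 parts
    return np.where(sbp > 0, pos, np.where(sbp < 0, neg, 0.0))


def balances_from_formula(x, sbp):
    """Hypothetical helper: evaluate b_i directly from the balance formula."""
    x = np.asarray(x, dtype=float)
    sbp = np.asarray(sbp)
    out = []
    for row in sbp:
        r = np.sum(row > 0)
        s = np.sum(row < 0)
        g_r = np.exp(np.log(x[row > 0]).mean())  # geometric mean of +1 parts
        g_s = np.exp(np.log(x[row < 0]).mean())  # geometric mean of -1 parts
        out.append(np.sqrt(r * s / (r + s)) * np.log(g_r / g_s))
    return np.array(out)


sbp = np.array([[1, 1, -1, -1, -1],
                [1, -1, 0, 0, 0],
                [0, 0, 1, -1, -1],
                [0, 0, 0, 1, -1]])

basis = sbp_to_clr_basis(sbp)

# An arbitrary composition, chosen only for illustration.
x = np.array([0.1, 0.2, 0.3, 0.25, 0.15])

b_formula = balances_from_formula(x, sbp)
b_basis = basis @ np.log(x)  # same balances, computed via the basis rows

print(np.allclose(basis @ basis.T, np.eye(len(sbp))))  # rows are orthonormal
print(np.allclose(b_formula, b_basis))                 # both routes agree
```

For the SBP used in the example above, the rows produced by this sketch agree with the values printed by sbp_basis, and each row sums to zero.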