skbio.stats.composition.perturb

skbio.stats.composition.perturb(x, y)

Perform the perturbation operation.

This operation is defined as:

\[x \oplus y = C[x_1 y_1, \ldots, x_D y_D]\]

\(C[x]\) is the closure operation defined as:

\[C[x] = \left[\frac{x_1}{\sum_{i=1}^{D} x_i},\ldots, \frac{x_D}{\sum_{i=1}^{D} x_i} \right]\]

for some \(D\)-dimensional real vector \(x\), where \(D\) is the number of components in each composition.
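The two formulas above can be sketched in plain NumPy. This is a minimal illustration of the definitions, not the library's implementation: the helper names `closure_sketch` and `perturb_sketch` are hypothetical, and no input validation (for example, rejecting negative values) is attempted.

```python
import numpy as np

def closure_sketch(x):
    """Hypothetical helper: rescale each composition (last axis) to sum to 1."""
    x = np.asarray(x, dtype=float)
    return x / x.sum(axis=-1, keepdims=True)

def perturb_sketch(x, y):
    """Hypothetical helper: element-wise product followed by closure."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return closure_sketch(x * y)

# A uniform composition perturbed by the factors (1/2, 1/2, 1).
result = perturb_sketch([1/3, 1/3, 1/3], [1/2, 1/2, 1])
print(result)
```

Because closure divides by the total, multiplying either input by a positive constant leaves the result unchanged, so the inputs do not need to sum to 1 in this sketch.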

Parameters:
x : array_like of shape (n_compositions, n_components)

A matrix of proportions.

y : array_like of shape (n_compositions, n_components)

A matrix of proportions.

Returns:
ndarray of shape (n_compositions, n_components)

A matrix of proportions where all of the values are non-zero and each composition (row) adds up to 1.
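For matrix inputs, the definition applies to each row independently. The following NumPy sketch works directly from the formula rather than calling `perturb`, and the input values are made up for illustration. It shows that the output keeps the input shape and that every row sums to 1.

```python
import numpy as np

# Two compositions (rows), three components (columns); values are illustrative.
x = np.array([[0.2, 0.3, 0.5],
              [0.1, 0.1, 0.8]])
y = np.array([[1.0, 1.0, 2.0],
              [2.0, 1.0, 1.0]])

# Element-wise product, then closure applied to each row separately.
product = x * y
result = product / product.sum(axis=1, keepdims=True)

print(result.shape)        # same shape as the inputs
print(result.sum(axis=1))  # each row sums to 1, up to floating-point error
```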

Examples

>>> import numpy as np
>>> from skbio.stats.composition import perturb

Consider a very simple environment with only three species. The species in the environment are evenly distributed and their proportions are equal:

>>> before = np.array([1/3, 1/3, 1/3])

Suppose that an antibiotic kills off half of the population for the first two species, but doesn’t harm the third species. Then the perturbation vector would be as follows:

>>> after = np.array([1/2, 1/2, 1])

The resulting perturbed composition is:

>>> perturb(before, after)
array([ 0.25,  0.25,  0.5 ])
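
This output can be checked by hand from the definition. The component-wise products are \(\tfrac{1}{6}, \tfrac{1}{6}, \tfrac{1}{3}\), which sum to \(\tfrac{2}{3}\), so the closure divides each product by \(\tfrac{2}{3}\):

\[x \oplus y = C\left[\tfrac{1}{3} \cdot \tfrac{1}{2},\ \tfrac{1}{3} \cdot \tfrac{1}{2},\ \tfrac{1}{3} \cdot 1\right] = C\left[\tfrac{1}{6},\ \tfrac{1}{6},\ \tfrac{1}{3}\right] = \left[\tfrac{1}{4},\ \tfrac{1}{4},\ \tfrac{1}{2}\right]\]

The third species was unharmed in absolute terms, yet its share of the community doubles from \(\tfrac{1}{3}\) to \(\tfrac{1}{2}\) because the other two species shrank.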