skbio.stats.ordination.MMvecResults.predict

MMvecResults.predict(microbes)

Predict metabolite distributions given microbe abundances.

Computes the expected metabolite distribution for each sample by marginalizing over microbe compositions:

P(metabolite) = sum_i P(microbe_i) * P(metabolite | microbe_i)
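
The marginalization above can be sketched in plain NumPy. This is an illustration of the formula only, not the library's implementation: the array names `counts` and `cond_probs` are hypothetical, and the sketch assumes that P(microbe_i) is each sample's count vector normalized to proportions and that P(metabolite | microbe_i) is available as a row-stochastic matrix.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical inputs (not part of the skbio API):
# counts[s, i]      raw count of microbe i in sample s
# cond_probs[i, j]  assumed P(metabolite_j | microbe_i); each row sums to 1
counts = rng.integers(1, 50, size=(5, 4)).astype(float)
cond_probs = rng.dirichlet(np.ones(3), size=4)

# Assumption: P(microbe_i) for a sample is its relative abundance.
microbe_probs = counts / counts.sum(axis=1, keepdims=True)

# P(metabolite_j) = sum_i P(microbe_i) * P(metabolite_j | microbe_i),
# which is a matrix product when done for every sample at once.
metabolite_probs = microbe_probs @ cond_probs

print(metabolite_probs.shape)        # (n_samples, n_metabolites)
print(metabolite_probs.sum(axis=1))  # each row sums to 1
```

Because both factors are row-stochastic, every row of the product is itself a probability distribution, which is consistent with the guarantee that each row of the returned predictions sums to 1.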

Parameters:
microbes : pd.DataFrame or array-like of shape (n_samples, n_microbes)

Microbe abundance counts. Columns must match the microbes used during training.

Returns:
predictions : pd.DataFrame

Predicted metabolite proportions for each sample. Shape: (n_samples, n_metabolites). Each row sums to 1.
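
Since the columns of `microbes` must match the microbes used during training, it may help to align a new table before calling `predict`. The following is a minimal pandas sketch under stated assumptions: `training_columns` is a hypothetical list of the training feature names, and zero-filling absent microbes and dropping unseen ones is one possible policy, not documented behavior of `predict`.

```python
import pandas as pd

# Hypothetical: the microbe IDs, in order, that the model was trained on.
training_columns = ['OTU_0', 'OTU_1', 'OTU_2']

# A new table with columns out of order, one microbe missing (OTU_1)
# and one microbe that was never seen during training (OTU_9).
new_microbes = pd.DataFrame({
    'OTU_2': [3, 1],
    'OTU_0': [5, 0],
    'OTU_9': [7, 2],
})

# Reorder to the training order, fill absent microbes with zero counts,
# and drop microbes the model has no parameters for.
aligned = new_microbes.reindex(columns=training_columns, fill_value=0)

print(list(aligned.columns))
```

Whether silently dropping unseen microbes is acceptable depends on the analysis; raising an error on any mismatch is a stricter alternative.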

Examples

>>> from skbio.stats.ordination import mmvec
>>> import numpy as np
>>> import pandas as pd
>>> np.random.seed(42)
>>> microbes = pd.DataFrame(
...     np.random.randint(1, 50, size=(20, 5)),
...     columns=[f'OTU_{i}' for i in range(5)]
... )
>>> metabolites = pd.DataFrame(
...     np.random.randint(1, 50, size=(20, 8)),
...     columns=[f'met_{i}' for i in range(8)]
... )
>>> result = mmvec(microbes, metabolites, n_components=2, max_iter=10)
>>> # Predict on new samples
>>> new_microbes = pd.DataFrame(
...     np.random.randint(1, 50, size=(5, 5)),
...     columns=[f'OTU_{i}' for i in range(5)]
... )
>>> predictions = result.predict(new_microbes)
>>> predictions.shape
(5, 8)
>>> np.allclose(predictions.sum(axis=1), 1.0)
True