skbio.stats.ordination.MMvecResults.score
- MMvecResults.score(microbes, metabolites)
Compute Q² (coefficient of prediction) on held-out data.
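Q² measures predictive performance on test data. It is analogous to R², but is computed on held-out samples, as in cross-validation. Values range from -inf to 1: a value of 1 indicates perfect prediction, 0 indicates prediction no better than the per-metabolite mean, and negative values indicate prediction worse than that mean.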
\[Q^2 = 1 - \frac{SS_{res}}{SS_{tot}} = 1 - \frac{\sum(y - \hat{y})^2}{\sum(y - \bar{y}_j)^2}\]
where \(\bar{y}_j\) is the per-metabolite mean across samples.
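The formula can be illustrated with a minimal NumPy sketch. This is not the library's implementation: the helper name `q2_score` is hypothetical, and the sketch assumes that observed and predicted values are already on the same scale and that the per-metabolite mean is taken over the samples passed in. Any normalization that `score` applies internally is not reproduced here.

```python
import numpy as np


def q2_score(y_true, y_pred):
    """Hypothetical helper: Q^2 = 1 - SS_res / SS_tot.

    Rows are samples and columns are metabolites. The baseline is the
    per-column (per-metabolite) mean of ``y_true``.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # Residual sum of squares between observations and predictions.
    ss_res = np.sum((y_true - y_pred) ** 2)
    # Total sum of squares around the per-metabolite mean.
    col_mean = y_true.mean(axis=0, keepdims=True)
    ss_tot = np.sum((y_true - col_mean) ** 2)
    return float(1.0 - ss_res / ss_tot)


# Two samples, two metabolites (made-up numbers for illustration only).
y = np.array([[1.0, 2.0], [3.0, 6.0]])
perfect = q2_score(y, y)
baseline = q2_score(y, np.tile(y.mean(axis=0), (2, 1)))
worse = q2_score(y, y[::-1])
```

In this toy case a perfect prediction gives the maximum score, predicting the column means gives the baseline score, and swapping the two rows gives a negative score, matching the interpretation of the value range described above.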
- Parameters:
- microbes : pd.DataFrame or array-like of shape (n_samples, n_microbes)
Test microbe abundance counts.
- metabolites : pd.DataFrame or array-like of shape (n_samples, n_metabolites)
Test metabolite abundance counts.
- Returns:
- q2 : float
Q² score. Higher is better, with 1.0 being perfect prediction.
See also
predict : Predict metabolite distributions.
probabilities : Get conditional probability matrix.
Examples
>>> from skbio.stats.ordination import mmvec
>>> import numpy as np
>>> import pandas as pd
>>> np.random.seed(42)
>>> # Training data
>>> microbes = pd.DataFrame(
...     np.random.randint(1, 50, size=(30, 5)),
...     columns=[f'OTU_{i}' for i in range(5)]
... )
>>> metabolites = pd.DataFrame(
...     np.random.randint(1, 50, size=(30, 8)),
...     columns=[f'met_{i}' for i in range(8)]
... )
>>> result = mmvec(microbes, metabolites, n_components=2, max_iter=50)
>>> # Evaluate on test data
>>> test_microbes = pd.DataFrame(
...     np.random.randint(1, 50, size=(10, 5)),
...     columns=[f'OTU_{i}' for i in range(5)]
... )
>>> test_metabolites = pd.DataFrame(
...     np.random.randint(1, 50, size=(10, 8)),
...     columns=[f'met_{i}' for i in range(8)]
... )
>>> q2 = result.score(test_microbes, test_metabolites)
>>> isinstance(q2, float)
True