skbio.stats.ordination.MMvecResult
- class skbio.stats.ordination.MMvecResult(estimator)
Result of an MMvec analysis.
This class contains the learned embeddings and co-occurrence patterns from fitting an MMvec model. It enables both interpretation (which conditioning (X) and conditioned (Y) features co-occur with each other) and prediction (expected Y composition given X composition).
- Attributes:
x_embeddings : table_like of shape (n_features_x, n_dimensions + 1)
    Learned coordinates of conditioning features (X) in latent space.
y_embeddings : table_like of shape (n_features_y, n_dimensions + 1)
    Learned coordinates of conditioned features (Y) in latent space.
ranks : table_like of shape (n_features_x, n_features_y)
    Row-centered log conditional probability matrix.
probs : table_like of shape (n_features_x, n_features_y)
    Conditional probability matrix of co-occurrence of X and Y features.
convergence : table_like of shape (n_iterations,)
    Loss (negative log-posterior) over iterations during training.
See also
Notes
Detecting overfitting with Q-squared
Overfitting occurs when the model memorizes training data rather than learning generalizable patterns. To detect overfitting:
Split your data into training and test sets before fitting.
Fit the model on training data only.
Use score to compute \(Q^2\) on held-out test data.
Interpretation of \(Q^2\) values:
Close to 1: Excellent predictive performance.
Close to 0: Model predicts no better than the mean.
Negative: Model performs worse than predicting the mean, indicating overfitting or model misspecification.
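The three regimes above can be reproduced with a small NumPy sketch. It assumes the common definition \(Q^2 = 1 - SS_{res} / SS_{tot}\), with the total sum of squares taken around the per-feature mean of the held-out data. The helper q_squared and the toy data are hypothetical and only illustrate the interpretation; the exact normalization used by score may differ.

```python
import numpy as np


def q_squared(y_true, y_pred):
    """Illustrative Q-squared: 1 - SS_res / SS_tot (assumed definition)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # Residual sum of squares between held-out observations and predictions.
    ss_res = np.sum((y_true - y_pred) ** 2)
    # Total sum of squares around the per-feature mean of the held-out data.
    ss_tot = np.sum((y_true - y_true.mean(axis=0)) ** 2)
    return 1.0 - ss_res / ss_tot


# Toy held-out Y compositions (rows: samples, columns: Y features).
y_test = np.array([[0.6, 0.3, 0.1],
                   [0.2, 0.5, 0.3],
                   [0.1, 0.2, 0.7]])

# Perfect predictions give a value of 1.
q2_perfect = q_squared(y_test, y_test)
# Predicting the per-feature mean for every sample gives a value of 0.
q2_mean = q_squared(y_test, np.tile(y_test.mean(axis=0), (3, 1)))
# Predictions that are worse than the mean (rows swapped) give a negative value.
q2_bad = q_squared(y_test, y_test[::-1])
```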
If \(Q^2\) is much lower than expected, try:
Reducing dimensions (fewer latent dimensions).
Increasing regularization via smaller x_prior_scale and y_prior_scale values.
Collecting more training samples.
Embedding Interpretation
The embeddings place X and Y features in the same latent space. The inner product between an X embedding vector and a Y embedding vector (plus their bias terms) gives the log-odds of their co-occurrence:
\[\log \frac{P(m_j \mid \mu_i)}{P(m_{\text{ref}} \mid \mu_i)} = X_i \cdot Y_j + b_{X_i} + b_{Y_j}\]
This means that X and Y features pointing in similar directions associate with each other, and the angle between a pair of X and Y vectors indicates their association strength.
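As a minimal sketch of the formula above, the log-odds matrix can be assembled from two embedding arrays with NumPy. The arrays here are random stand-ins, not output from a fitted model, and the layout (latent coordinates followed by a final bias column) is taken from the attribute descriptions on this page.

```python
import numpy as np

rng = np.random.default_rng(0)
n_x, n_y, n_dim = 4, 5, 2

# Hypothetical embeddings: latent coordinates plus a final bias column.
x_emb = rng.normal(size=(n_x, n_dim + 1))
y_emb = rng.normal(size=(n_y, n_dim + 1))

# Separate the latent coordinates from the bias terms.
x_coords, x_bias = x_emb[:, :-1], x_emb[:, -1]
y_coords, y_bias = y_emb[:, :-1], y_emb[:, -1]

# log_odds[i, j] = X_i . Y_j + b_{X_i} + b_{Y_j}
log_odds = x_coords @ y_coords.T + x_bias[:, None] + y_bias[None, :]
```

Each entry is one inner product plus two bias terms, so the full matrix has shape (n_features_x, n_features_y).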
Attributes
convergence: Loss (negative log-posterior) over iterations during training.
probs: Conditional probability matrix of co-occurrence of X and Y features.
ranks: Row-centered log conditional probability matrix.
x_embeddings: Learned coordinates of conditioning features (X) in latent space.
y_embeddings: Learned coordinates of conditioned features (Y) in latent space.
Methods
Predict conditioned feature compositions given conditioning features.
score: Compute Q-squared (coefficient of prediction) on held-out data.
Special methods
__str__: Return str(self).
Special methods (inherited)
__eq__: Return self==value.
__ge__: Return self>=value.
__getstate__: Helper for pickle.
__gt__: Return self>value.
__hash__: Return hash(self).
__le__: Return self<=value.
__lt__: Return self<value.
__ne__: Return self!=value.
Details
- convergence
Loss (negative log-posterior) over iterations during training.
Use this to diagnose training issues. The loss should generally decrease and stabilize. If the loss is still decreasing at the final iteration, consider increasing max_iter. If the loss oscillates (Adam optimizer), try reducing learning_rate.
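One way to check whether the loss has stabilized is to compare its mean over the last few iterations with the mean over the window before it. The helper still_decreasing, its window size, and its tolerance are hypothetical choices for this sketch, not part of the library. It assumes the loss values are nonzero.

```python
import numpy as np


def still_decreasing(loss, window=10, rel_tol=1e-3):
    """Return True if the mean loss dropped by more than rel_tol
    (relative) between the last two windows of iterations."""
    loss = np.asarray(loss, dtype=float)
    if loss.size < 2 * window:
        raise ValueError("need at least 2 * window loss values")
    # Mean loss over the second-to-last and the last window.
    prev = loss[-2 * window:-window].mean()
    last = loss[-window:].mean()
    return bool((prev - last) / abs(prev) > rel_tol)


iterations = np.arange(200)
# A loss curve that has flattened out by the final iteration.
plateaued = 1.0 + np.exp(-0.1 * iterations)
# A loss curve that is still falling steadily when training stops.
unfinished = 100.0 - 0.4 * iterations

plateaued_flag = still_decreasing(plateaued)
unfinished_flag = still_decreasing(unfinished)
```

A True result suggests training stopped early, which is the situation where raising max_iter is worth trying.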
- probs
Conditional probability matrix of co-occurrence of X and Y features.
Entry (i, j) represents the probability of observing \(Y_j\) given \(X_i\). Each row sums to 1. This matrix is derived from the softmax transformation of the ranks matrix and is cached lazily.
- ranks
Row-centered log conditional probability matrix.
Entry (i, j) represents the log-odds of observing \(Y_j\) given \(X_i\), relative to the row mean. Higher values indicate stronger positive associations. This matrix is row-centered (each row sums to 0) for identifiability.
The actual conditional probabilities (see probs) can be obtained by transforming this matrix with clr_inv (a.k.a. softmax).
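The relationship between the two matrices can be sketched with plain NumPy. The helper softmax_rows and the toy log-odds values are hypothetical; the sketch only illustrates row-centering and the row-wise softmax described here, not the library's internal code.

```python
import numpy as np


def softmax_rows(a):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    a = np.asarray(a, dtype=float)
    e = np.exp(a - a.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)


# Toy log-odds values (rows: X features, columns: Y features).
log_odds = np.array([[2.0, 0.0, -1.0],
                     [0.5, 0.5, 3.0]])

# Row-centering: subtract each row's mean so that every row sums to 0.
ranks = log_odds - log_odds.mean(axis=1, keepdims=True)

# Softmax maps each row of ranks to a probability vector that sums to 1.
probs = softmax_rows(ranks)

# Going back: centering the log-probabilities recovers the ranks.
log_probs = np.log(probs)
recovered = log_probs - log_probs.mean(axis=1, keepdims=True)
```

Because softmax ignores a constant added to a row, centering loses no information about the probabilities; it only fixes which of the equivalent representations is reported.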
- x_embeddings
Learned coordinates of conditioning features (X) in latent space.
Each row is a vector representation of an X feature. Features with similar embeddings tend to co-occur with similar sets of Y features. The Euclidean distance or cosine similarity between embeddings can be used to identify feature relatedness. The last column (+1; “bias”) captures the baseline tendency of each X feature to associate with Y features overall.
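A minimal sketch of comparing features by cosine similarity follows. The embedding values are made up, and the helper cosine_similarity_matrix is hypothetical. Dropping the bias column before comparing directions is a choice made for this sketch, not something the library prescribes.

```python
import numpy as np


def cosine_similarity_matrix(coords):
    """Pairwise cosine similarity between the rows of coords."""
    coords = np.asarray(coords, dtype=float)
    # Scale every row to unit length; assumes no all-zero rows.
    unit = coords / np.linalg.norm(coords, axis=1, keepdims=True)
    return unit @ unit.T


# Hypothetical X embeddings: two latent dimensions plus a final bias column.
x_emb = np.array([[1.0, 0.0, 0.3],
                  [2.0, 0.1, -0.5],
                  [0.0, 1.0, 0.2]])

# Compare directions in latent space only (bias column excluded).
coords = x_emb[:, :-1]
sim = cosine_similarity_matrix(coords)
```

In this toy example the first two features point in nearly the same direction, so their similarity is close to 1, while the first and third are orthogonal.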
- y_embeddings
Learned coordinates of conditioned features (Y) in latent space.
See x_embeddings. The first row is all zeros and represents the reference Y feature used for identifiability. The last column (+1; “bias”) captures the baseline abundance of each Y feature.