skbio.stats.ordination.rda

skbio.stats.ordination.rda(y, x, scale_Y=False, scaling=1, sample_ids=None, feature_ids=None, constraint_ids=None, output_format=None)

Compute redundancy analysis, a type of canonical analysis.

It is related to PCA and multiple regression: the explained variables y are regressed on the explanatory variables x, and PCA is then performed on the fitted values. PCA is likewise performed on the residuals.
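The regression-then-PCA idea described above can be sketched with NumPy alone. This is an illustration under simplifying assumptions, not the library's implementation: the helper name `rda_sketch` is hypothetical, both matrices are only centered (no `scale_Y`, no biplot scaling), and eigenvalues are taken as squared singular values divided by n - 1.

```python
import numpy as np


def rda_sketch(y, x):
    """Hypothetical helper: regress y on x, then run PCA on fitted values and residuals."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    n = y.shape[0]

    # Center both matrices so the regression needs no intercept term.
    y_c = y - y.mean(axis=0)
    x_c = x - x.mean(axis=0)

    # Multiple regression of every response column on the explanatory variables.
    coef, *_ = np.linalg.lstsq(x_c, y_c, rcond=None)
    fitted = x_c @ coef
    residuals = y_c - fitted

    # PCA via SVD: squared singular values divided by (n - 1) are the
    # eigenvalues of the corresponding covariance matrix.
    s_fit = np.linalg.svd(fitted, compute_uv=False)
    s_res = np.linalg.svd(residuals, compute_uv=False)
    constrained = s_fit**2 / (n - 1)
    unconstrained = s_res**2 / (n - 1)
    return fitted, residuals, constrained, unconstrained


# Synthetic data: 10 samples, 4 response features, 2 explanatory variables.
rng = np.random.default_rng(0)
y_demo = rng.normal(size=(10, 4))
x_demo = rng.normal(size=(10, 2))
fitted, residuals, constrained, unconstrained = rda_sketch(y_demo, x_demo)
```

In this sketch the constrained and unconstrained eigenvalues together add up to the total variance of the centered response matrix, and at most as many constrained eigenvalues are nonzero as there are explanatory variables.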

RDA should be chosen if the studied gradient is short, and CCA when it is long, in which case the contingency table is sparse.

Parameters:

y : table_like

\(n \times p\) response matrix, where \(n\) is the number of samples and \(p\) is the number of features. Its columns must be dimensionally homogeneous (or you can set scale_Y=True). This matrix is also referred to as the community matrix, which commonly stores species abundances. See supported formats.

x : table_like

\(n \times m, n \geq m\) matrix of explanatory variables, where \(n\) is the number of samples and \(m\) is the number of metadata variables. Its columns need not be standardized, but doing so turns regression coefficients into standard regression coefficients. See above.

scale_Y : bool, optional

Controls whether the response matrix columns are scaled to have unit standard deviation. Defaults to False.

scaling : int

Scaling type 1 produces a distance biplot. It focuses on the ordination of rows (samples) because their transformed distances approximate their original Euclidean distances. It is especially useful when most explanatory variables are binary. Scaling type 2 produces a correlation biplot. It focuses on the relationships among explained variables (y). It is interpreted like scaling type 1, except that distances between objects do not approximate their Euclidean distances.

See more details about distance and correlation biplots in [1], § 9.1.4.

constraint_ids : list of str, optional

List of identifiers for metadata variables or constraints (applicable in constrained ordination methods). If not provided implicitly by the input data structure or explicitly by the user, defaults to integers starting at zero.

sample_ids, feature_ids, output_format : optional

Standard table parameters. See Common parameters for details.
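To illustrate what scaling the response columns to unit standard deviation (the `scale_Y` option) means in practice, here is a minimal NumPy sketch. The helper `scale_columns` is hypothetical, and the use of the sample standard deviation (`ddof=1`) is an assumption; this page does not state which convention the library applies.

```python
import numpy as np


def scale_columns(y, ddof=1):
    """Hypothetical helper: center each column and divide by its standard deviation."""
    y = np.asarray(y, dtype=float)
    centered = y - y.mean(axis=0)
    # ddof=1 (sample standard deviation) is an assumption made for this sketch.
    return centered / y.std(axis=0, ddof=ddof)


# Two columns measured on very different scales.
y_raw = np.array([[1.0, 100.0],
                  [2.0, 300.0],
                  [3.0, 200.0],
                  [6.0, 400.0]])
y_scaled = scale_columns(y_raw)
```

After scaling, every column contributes equally to the total variance, which is why this step is an alternative to requiring dimensionally homogeneous columns in y.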

Returns:

OrdinationResults: Object that stores the computed eigenvalues, the proportion explained by each of them (per unit), transformed coordinates for features and samples, biplot scores, sample constraints, etc.

Raises:

ValueError: If the data matrices have different numbers of rows.
ValueError: If the explanatory variables have fewer rows than columns.
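The two conditions above can be checked before calling the function. The following is a sketch of equivalent pre-checks, not the library's own validation code; the helper name `check_rda_shapes` and its error messages are hypothetical.

```python
import numpy as np


def check_rda_shapes(y, x):
    """Hypothetical pre-check mirroring the two documented ValueError conditions."""
    y = np.asarray(y)
    x = np.asarray(x)
    if y.shape[0] != x.shape[0]:
        raise ValueError(
            f"y has {y.shape[0]} rows but x has {x.shape[0]} rows."
        )
    n, m = x.shape
    if n < m:
        raise ValueError(
            f"x has fewer rows ({n}) than columns ({m})."
        )
    # Return (n samples, p features, m explanatory variables).
    return y.shape[0], y.shape[1], m


shapes = check_rda_shapes(np.zeros((5, 3)), np.zeros((5, 2)))
```

Running such a check early gives a clearer error at the call site than letting a shape mismatch surface deep inside the computation.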