skbio.stats.ordination.ca

skbio.stats.ordination.ca(X, scaling=1, sample_ids=None, feature_ids=None, output_format=None)

Compute correspondence analysis.

Correspondence analysis is a multivariate statistical technique for ordination. Rows of the data table usually correspond to samples and columns to features, but the method is symmetric. The correspondence between rows and columns is measured with the \(\chi^2\) distance, and those distances are preserved in the transformed space. The \(\chi^2\) distance doesn’t take double zeros into account, so it is expected to produce a better ordination than PCA when the data contain many zero values.

Correspondence analysis is related to Principal Component Analysis (PCA), but it should be preferred when gradients are steep or long, that is, when the input data matrix contains many zeros.
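As an illustration of the \(\chi^2\) distance discussed above, the following is a minimal NumPy sketch, not part of scikit-bio. It assumes one common textbook form of the distance, in which row profiles are compared with weights equal to the grand total divided by each column total; the helper name `chi2_distance` and the toy table are hypothetical.

```python
import numpy as np


def chi2_distance(Y, i, j):
    """Chi-square distance between rows i and j of a non-negative table.

    Hypothetical helper for illustration only. Assumes every row total and
    every column total of Y is strictly positive.
    """
    Y = np.asarray(Y, dtype=float)
    row_totals = Y.sum(axis=1)
    col_totals = Y.sum(axis=0)
    grand_total = Y.sum()
    # Compare relative row profiles rather than raw counts.
    profile_diff = Y[i] / row_totals[i] - Y[j] / row_totals[j]
    # A column where both rows are zero contributes exactly 0 to this sum.
    return np.sqrt(grand_total * np.sum(profile_diff ** 2 / col_totals))


# Toy table: rows 0 and 2 share the same features in the same proportions,
# while rows 0 and 1 have no feature in common.
Y = np.array([[0, 1, 1],
              [1, 0, 0],
              [0, 4, 4]])

d_chi2_shared = chi2_distance(Y, 0, 2)
d_chi2_disjoint = chi2_distance(Y, 0, 1)
d_euc_shared = np.linalg.norm(Y[0] - Y[2])
d_euc_disjoint = np.linalg.norm(Y[0] - Y[1])

print("chi2:      shared =", d_chi2_shared, " disjoint =", d_chi2_disjoint)
print("euclidean: shared =", d_euc_shared, " disjoint =", d_euc_disjoint)
```

On this table the Euclidean distance ranks the two rows with nothing in common as closer than the two rows with identical composition, whereas the \(\chi^2\) distance ranks them the other way round.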

Parameters:

X (table_like of shape (n_samples, n_features)): Input data table. See supported formats. Data must be non-negative and dimensionally homogeneous (numeric or binary).
scaling ({1, 2}): Scaling type 1 maintains \(\chi^2\) distances between rows. Scaling type 2 preserves \(\chi^2\) distances between columns. For a more detailed explanation of the interpretation, see the notes below and Legendre & Legendre 1998, section 9.4.3.
sample_ids, feature_ids, output_format (optional): Standard table parameters. See Common parameters for details.
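To make the effect of `scaling` concrete, here is a hedged NumPy sketch of correspondence analysis through a singular value decomposition. It follows the usual textbook formulation and is not necessarily how scikit-bio implements `ca`; the helpers `ca_sketch`, `chi2_row_distances` and `euclidean` are hypothetical names introduced for this example, and the input table is arbitrary.

```python
import numpy as np


def ca_sketch(X, scaling=1):
    """Illustrative CA via SVD; returns (eigvals, row_scores, col_scores).

    Assumes X is non-negative with strictly positive row and column totals.
    """
    X = np.asarray(X, dtype=float)
    Q = X / X.sum()                      # relative frequencies
    r = Q.sum(axis=1)                    # row weights
    c = Q.sum(axis=0)                    # column weights
    expected = np.outer(r, c)
    Q_bar = (Q - expected) / np.sqrt(expected)
    U_hat, W, Ut = np.linalg.svd(Q_bar, full_matrices=False)
    keep = W > 1e-10                     # drop axes with (near-)zero variance
    U_hat, W, U = U_hat[:, keep], W[keep], Ut[keep].T
    V = U / np.sqrt(c)[:, None]
    V_hat = U_hat / np.sqrt(r)[:, None]
    if scaling == 1:                     # rows carry the singular values
        return W ** 2, V_hat * W, V
    if scaling == 2:                     # columns carry the singular values
        return W ** 2, V_hat, V * W
    raise NotImplementedError("scaling must be 1 or 2")


def chi2_row_distances(X):
    """Pairwise chi-square distances between the rows of X."""
    X = np.asarray(X, dtype=float)
    profiles = X / X.sum(axis=1, keepdims=True)
    weights = X.sum() / X.sum(axis=0)
    diff = profiles[:, None, :] - profiles[None, :, :]
    return np.sqrt((diff ** 2 * weights).sum(axis=-1))


def euclidean(A):
    """Pairwise Euclidean distances between the rows of A."""
    diff = A[:, None, :] - A[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


X = np.array([[10, 10, 20],
              [10, 15, 10],
              [15, 5, 5]])

_, rows_scaling1, _ = ca_sketch(X, scaling=1)
_, _, cols_scaling2 = ca_sketch(X, scaling=2)

# Scaling 1: Euclidean distances between row scores vs. chi-square
# distances between the rows of X.
rows_ok = np.allclose(euclidean(rows_scaling1), chi2_row_distances(X))
# Scaling 2: the same comparison for columns (rows of the transposed table).
cols_ok = np.allclose(euclidean(cols_scaling2), chi2_row_distances(X.T))
print(rows_ok, cols_ok)
```

Axis signs from an SVD are arbitrary, so individual scores may be mirrored between implementations; the pairwise distances compared here are unaffected by that.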

Returns:

OrdinationResults: Object that stores the computed eigenvalues, the transformed sample coordinates, the transformed feature coordinates, and the proportion explained.

Raises:

NotImplementedError: If the scaling value is neither 1 nor 2.
ValueError: If any of the input matrix elements are negative.