skbio.diversity.beta_diversity

skbio.diversity.beta_diversity(metric, counts, ids=None, validate=True, pairwise_func=None, **kwargs)

Compute distances between all pairs of samples.

Parameters:

metric (str or callable): The beta diversity metric, i.e., a pairwise distance function to apply to the samples. See skbio.diversity.beta and SciPy’s pdist for available metrics. Passing metric as a string is preferable, as this often results in an optimized version of the metric being used.
counts (table_like of shape (n_samples, n_taxa) or (n_taxa,)): Vector or matrix containing count/abundance data of one or multiple samples. See supported formats.
ids (array_like of shape (n_samples,), optional): Identifiers for each sample in counts. If not provided, sample IDs are extracted from counts, if available; otherwise integer identifiers are assigned in the order that samples were provided.
validate (bool, optional): If True (default), validate the input data before applying the beta diversity metric. See skbio.diversity for the details of validation.
pairwise_func (callable, optional): The function to use for computing pairwise distances. Must take counts and metric and return a square, hollow, 2-D float array of dissimilarities. Examples of functions that can be provided are SciPy’s pdist (default) and scikit-learn’s pairwise_distances.
kwargs (dict, optional): Metric-specific parameters. Refer to the documentation of the chosen metric. A special parameter is taxa, needed by some phylogenetic metrics. If not provided, taxa (feature IDs) are extracted from counts, if available, and passed to the metric.

Returns:

DistanceMatrix: Distances between all pairs of samples (i.e., rows). The number of rows and columns will be equal to the number of rows in counts.

Raises:

ValueError, MissingNodeError, DuplicateNodeError: If validation fails. Exact error will depend on what was invalid.
Any Exception: If invalid metric-specific parameters are provided.