skbio.diversity.beta_diversity

skbio.diversity.beta_diversity(metric, counts, ids=None, validate=True, pairwise_func=None, **kwargs)

Compute distances between all pairs of samples.

Parameters:
metric : str or callable

The beta diversity metric, i.e., a pairwise distance function to apply to the sample(s). See beta and SciPy’s pdist for available metrics. Passing metric as a string is preferable as this often results in an optimized version of the metric being used.

counts : table_like of shape (n_samples, n_taxa) or (n_taxa,)

Vector or matrix containing count/abundance data of one or multiple samples. See supported formats.

ids : array_like of shape (n_samples,), optional

Identifiers for each sample in counts. If not provided, sample IDs are extracted from counts when available; otherwise, integer identifiers are assigned in the order in which the samples were provided.

validate : bool, optional

If True (default), validate the input data before applying the beta diversity metric. See skbio.diversity for the details of validation.

pairwise_func : callable, optional

The function to use for computing pairwise distances. Must take counts and metric and return a square, hollow, 2-D float array of dissimilarities. Examples of functions that can be provided are SciPy’s pdist (default) and scikit-learn’s pairwise_distances.
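As a rough illustration of the contract described above, the following sketch defines a hypothetical `naive_pairwise` function that takes counts and a metric and returns a square, hollow, symmetric 2-D float array. The names `naive_pairwise` and `bray_curtis` are invented for this example, and the sketch assumes the metric arrives as a callable, not as a string; it is not how scikit-bio, SciPy, or scikit-learn implement pairwise distances.

```python
import numpy as np


def bray_curtis(u, v):
    """Illustrative Bray-Curtis dissimilarity (not scikit-bio's implementation)."""
    total = np.sum(u) + np.sum(v)
    # Assumption: two empty samples are treated as identical (distance 0).
    return float(np.sum(np.abs(u - v)) / total) if total else 0.0


def naive_pairwise(counts, metric):
    """Hypothetical pairwise function returning a square, hollow float array."""
    counts = np.asarray(counts, dtype=float)
    n = counts.shape[0]
    out = np.zeros((n, n), dtype=float)  # diagonal stays at 0 (hollow)
    for i in range(n):
        for j in range(i + 1, n):
            # Fill both triangles so the result is symmetric.
            out[i, j] = out[j, i] = metric(counts[i], counts[j])
    return out


# Toy data: three samples (rows) by three taxa (columns); values are made up.
counts = [[10, 0, 5], [0, 10, 5], [10, 0, 5]]
distances = naive_pairwise(counts, metric=bray_curtis)
```

A function of this shape only handles callable metrics; supporting string metric names would require additional dispatch logic that is omitted here.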

kwargs : dict, optional

Metric-specific parameters. Refer to the documentation of the chosen metric. A special parameter is taxa, which is needed by some phylogenetic metrics. If taxa is not provided, the taxa (feature IDs) are extracted from counts when available and passed to the metric.

Returns:
DistanceMatrix

Distances between all pairs of samples (i.e., rows). The number of rows and columns will be equal to the number of rows in counts.

Raises:
ValueError, MissingNodeError, DuplicateNodeError

If validation fails. Exact error will depend on what was invalid.

Any Exception

If invalid metric-specific parameters are provided.