skbio.stats.distance.DistanceMatrix

class skbio.stats.distance.DistanceMatrix(data, ids=None, validate=True, condensed=False)

Store distances between objects.

A DistanceMatrix is a SymmetricMatrix with the additional requirement that the matrix data is hollow (i.e., the diagonal is zero). Additional methods take advantage of this hollowness.

Parameters:
data : 1-D or 2-D array_like, or PairwiseMatrix

A square 2-D array of pairwise distances between objects, or a 1-D array representing its condensed form. Can instead be an instance of PairwiseMatrix or its subclass, in which case its data and IDs will be directly used.

ids : sequence of str, optional

IDs of the objects. Must match the number of rows/columns in data. If None (default) and data does not contain IDs, IDs will be monotonically-increasing integers cast as strings, starting from zero (i.e., ‘0’, ‘1’, ‘2’, ‘3’, …).

validate : bool, optional

If True (default) and data is not a DistanceMatrix object, the input data will be validated.

condensed : bool, optional

Store the data in a 2-D redundant form (False, default) or a 1-D condensed form (True).

diagonal : 1-D array_like or float, optional

Values along the diagonal of the matrix. Must be zero. This parameter is a placeholder for interface compatibility with SymmetricMatrix.
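The default IDs and the symmetric, hollow requirement described above can be illustrated with a small pure-Python sketch. The helper names default_ids and check_symmetric_and_hollow are hypothetical and exist only for this illustration; they are not part of scikit-bio and do not claim to reproduce its actual validation logic.

```python
def default_ids(n):
    """Return the IDs '0', '1', ..., str(n - 1), as described for ids=None."""
    return tuple(str(i) for i in range(n))


def check_symmetric_and_hollow(data, tol=0.0):
    """Raise ValueError unless data is square, symmetric, and hollow.

    Hypothetical helper for illustration only; not scikit-bio's validator.
    """
    n = len(data)
    for row in data:
        if len(row) != n:
            raise ValueError("data must be square")
    for i in range(n):
        # Hollow: every diagonal entry must be zero.
        if abs(data[i][i]) > tol:
            raise ValueError(f"diagonal entry {i} is not zero")
        # Symmetric: entry (i, j) must equal entry (j, i).
        for j in range(i + 1, n):
            if abs(data[i][j] - data[j][i]) > tol:
                raise ValueError(f"entries ({i}, {j}) and ({j}, {i}) differ")


data = [[0.0, 0.5, 1.0],
        [0.5, 0.0, 0.75],
        [1.0, 0.75, 0.0]]
check_symmetric_and_hollow(data)  # passes silently for valid input
ids = default_ids(len(data))
```

Here ids holds three string labels counting up from zero, one per row of data.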

Notes

By default, the distances are stored in redundant (square-form) format [1]; pass condensed=True to store them in condensed (vector-form) format instead. To facilitate use with other scientific Python routines (e.g., SciPy), the distances can be retrieved in condensed format using condensed_form().
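The relationship between the two formats can be sketched in plain Python. This assumes the row-wise upper-triangle ordering used by SciPy's squareform and pdist; the function names to_condensed, condensed_index, and to_square are hypothetical helpers written for this illustration, not scikit-bio methods.

```python
def to_condensed(square):
    """Flatten the strict upper triangle of a square matrix, row by row."""
    n = len(square)
    return [square[i][j] for i in range(n) for j in range(i + 1, n)]


def condensed_index(n, i, j):
    """Map an off-diagonal pair (i, j) of an n x n matrix to a vector index."""
    if i == j:
        raise ValueError("the diagonal is not stored in condensed form")
    if i > j:
        i, j = j, i  # symmetry: (i, j) and (j, i) share one entry
    return n * i + j - ((i + 2) * (i + 1)) // 2


def to_square(condensed, n):
    """Rebuild the redundant form, filling the diagonal with zeros."""
    square = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            value = condensed[condensed_index(n, i, j)]
            square[i][j] = square[j][i] = value
    return square


square = [[0, 1, 2, 3],
          [1, 0, 4, 5],
          [2, 4, 0, 6],
          [3, 5, 6, 0]]
condensed = to_condensed(square)  # 6 values for 4 objects: n * (n - 1) / 2
```

The condensed vector omits the zero diagonal and the mirrored lower triangle, which is why a hollow, symmetric matrix can be stored without loss in n * (n - 1) / 2 values.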

DistanceMatrix only requires that the distances it stores are symmetric and hollow. Checks are not performed to ensure the other three metric properties hold (non-negativity, identity of indiscernibles, and triangle inequality) [2]. Thus, a DistanceMatrix instance can store distances that are not metric.
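As a minimal sketch of what "not metric" can mean here, the following pure-Python check looks for triangle inequality violations in a matrix that is nonetheless symmetric and hollow. The function triangle_violations is a hypothetical helper for this illustration and is not provided by scikit-bio.

```python
from itertools import combinations


def triangle_violations(dist, ids):
    """List (a, via, b) triples where d(a, b) > d(a, via) + d(via, b).

    Assumes dist is a square, symmetric, hollow list of lists.
    """
    n = len(ids)
    violations = []
    for i, k in combinations(range(n), 2):
        for j in range(n):
            if j in (i, k):
                continue
            # The direct distance exceeds the detour through object j.
            if dist[i][k] > dist[i][j] + dist[j][k]:
                violations.append((ids[i], ids[j], ids[k]))
    return violations


ids = ("a", "b", "c")
# Symmetric and hollow, but d(a, c) = 5 exceeds d(a, b) + d(b, c) = 2.
dist = [[0, 1, 5],
        [1, 0, 1],
        [5, 1, 0]]
violations = triangle_violations(dist, ids)
```

A matrix like dist satisfies the two properties that DistanceMatrix requires, so a check of this kind has to be run separately when a downstream method assumes a true metric.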

References

Attributes (inherited)

T

Transpose of the matrix.

data

Array of pairwise relationships.

default_write_format

diagonal

Diagonal value(s) of the matrix.

dtype

Data type of the matrix values.

ids

Tuple of object IDs.

shape

Two-element tuple containing the redundant form matrix dimensions.

size

Total number of elements in the underlying data structure.

Methods

to_series()

Create a pandas Series from this DistanceMatrix.

Methods (inherited)

as_condensed()

Return a condensed form deep copy of the matrix.

as_redundant()

Return a redundant form deep copy of the matrix.

between(from_, to_[, allow_overlap])

Obtain the pairwise values between the two groups of IDs.

condensed_form()

Return an array of distances in condensed format.

copy()

Return a deep copy of the symmetric matrix.

filter(ids[, strict])

Filter the matrix by IDs.

from_iterable(iterable, metric[, key, keys, ...])

Create a symmetric matrix from an iterable given a metric.

index(lookup_id)

Return the index of the specified ID.

permute([condensed, seed])

Randomly permute both rows and columns in the matrix.

plot([cmap, title])

Create a heatmap of the matrix.

read([format])

Create a new DistanceMatrix instance from a file.

redundant_form()

Return an array of values in redundant format.

rename(mapper[, strict])

Rename IDs in the matrix.

to_data_frame()

Create a pandas DataFrame from this matrix.

transpose()

Return the transpose of the matrix.

within(ids)

Obtain all the pairwise values among the set of IDs.

write(file[, format])

Write an instance of DistanceMatrix to a file.

Special methods (inherited)

__contains__(lookup_id)

Check if the specified ID is in the matrix.

__eq__(other)

Compare this matrix to another for equality.

__ge__(value, /)

Return self>=value.

__getitem__(index)

Slice into data by object ID or NumPy indexing.

__getstate__(/)

Helper for pickle.

__gt__(value, /)

Return self>value.

__le__(value, /)

Return self<=value.

__lt__(value, /)

Return self<value.

__ne__(other)

Determine whether two matrices are not equal.

__str__()

Return a string representation of the matrix.

Details