skbio.stats.distance.DistanceMatrix

class skbio.stats.distance.DistanceMatrix(data, ids=None, validate=True, condensed=False)

Store distances between objects.

A DistanceMatrix is a SymmetricMatrix with the additional requirement that the matrix is hollow (i.e., every value on the diagonal is zero). Additional methods are available that take advantage of this hollowness.

Parameters:
data : 1-D or 2-D array_like, or PairwiseMatrix

A square 2-D array of pairwise distances between objects, or a 1-D array representing its condensed form. Can instead be an instance of PairwiseMatrix or its subclass, in which case its data and IDs will be directly used.

ids : sequence of str, optional

IDs of the objects. Must match the number of rows/columns in data. If None (default) and data does not contain IDs, IDs will be monotonically-increasing integers cast as strings, starting from zero (i.e., ‘0’, ‘1’, ‘2’, ‘3’, …).

validate : bool, optional

If True (default) and data is not a DistanceMatrix object, the input data will be validated.

condensed : bool, optional

Store the data in a 2-D redundant form (False, default) or a 1-D condensed form (True).

diagonal : 1-D array_like or float, optional

Values along the diagonal of the matrix. Must be zero. This parameter is a placeholder for interface compatibility with SymmetricMatrix.
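
The default-ID rule described for ids can be sketched in plain Python. The helper name resolve_ids is hypothetical and is not part of scikit-bio; it only illustrates the documented behavior (string-cast integers starting from zero, and an ID count that must match the matrix size), not how the library implements it.

```python
def resolve_ids(data, ids=None):
    """Return a tuple of IDs for a square matrix (hypothetical helper)."""
    n = len(data)
    if ids is None:
        # Default IDs: '0', '1', '2', ... (integers cast as strings).
        return tuple(str(i) for i in range(n))
    ids = tuple(ids)
    if len(ids) != n:
        # The number of IDs must match the number of rows/columns.
        raise ValueError(f"Expected {n} IDs, got {len(ids)}.")
    return ids


matrix = [[0.0, 0.5], [0.5, 0.0]]
default_ids = resolve_ids(matrix)
custom_ids = resolve_ids(matrix, ["a", "b"])
```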

Notes

The distances are stored in redundant (square-form) format [1]. To facilitate use with other scientific Python routines (e.g., scipy), the distances can be retrieved in condensed (vector-form) format using condensed_form.
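
The relationship between the two formats can be illustrated with a minimal, dependency-free sketch. It assumes the condensed vector lists the upper triangle row by row, which is the convention used by scipy.spatial.distance.squareform. The function names to_condensed and to_redundant are hypothetical and are not scikit-bio methods.

```python
import math


def to_condensed(square):
    """Flatten the strict upper triangle of a square matrix, row by row."""
    n = len(square)
    return [square[i][j] for i in range(n) for j in range(i + 1, n)]


def to_redundant(condensed):
    """Rebuild a symmetric, hollow square matrix from a condensed vector."""
    m = len(condensed)
    # Solve n * (n - 1) / 2 == m for the number of objects n.
    n = (1 + math.isqrt(1 + 8 * m)) // 2
    if n * (n - 1) // 2 != m:
        raise ValueError("Length is not a valid condensed-vector size.")
    square = [[0.0] * n for _ in range(n)]
    k = 0
    for i in range(n):
        for j in range(i + 1, n):
            # Mirror each value across the diagonal.
            square[i][j] = square[j][i] = condensed[k]
            k += 1
    return square


square = [[0.0, 1.0, 2.0],
          [1.0, 0.0, 3.0],
          [2.0, 3.0, 0.0]]
condensed = to_condensed(square)
round_trip = to_redundant(condensed)
```

For n objects the condensed vector holds n(n-1)/2 values, so it avoids storing the zero diagonal and the mirrored lower triangle.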

DistanceMatrix only requires that the distances it stores are symmetric and hollow. Checks are not performed to ensure the other three metric properties hold (non-negativity, identity of indiscernibles, and triangle inequality) [2]. Thus, a DistanceMatrix instance can store distances that are not metric.
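
To make the distinction concrete, the sketch below checks the two required properties separately from the triangle inequality. Both function names are hypothetical, and the checks are plain-Python illustrations of the definitions, not scikit-bio's validation code. The example matrix is symmetric and hollow, yet the distance between the first and third objects (5) exceeds the sum of the distances through the second object (1 + 1), so it is not metric.

```python
from itertools import permutations


def is_symmetric_and_hollow(d):
    """Check the two properties the text says DistanceMatrix requires."""
    n = len(d)
    hollow = all(d[i][i] == 0 for i in range(n))
    symmetric = all(
        d[i][j] == d[j][i] for i in range(n) for j in range(i + 1, n)
    )
    return hollow and symmetric


def satisfies_triangle_inequality(d):
    """Check d[i][k] <= d[i][j] + d[j][k] for all distinct i, j, k."""
    n = len(d)
    return all(
        d[i][k] <= d[i][j] + d[j][k]
        for i, j, k in permutations(range(n), 3)
    )


distances = [[0, 1, 5],
             [1, 0, 1],
             [5, 1, 0]]
meets_requirements = is_symmetric_and_hollow(distances)
is_metric = satisfies_triangle_inequality(distances)
```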

References

Attributes (inherited)

T

Transpose of the matrix.

data

Array of pairwise relationships.

default_write_format

Default write format for this object: lsmat.

diagonal

Diagonal value(s) of the matrix.

dtype

Data type of the matrix values.

ids

Tuple of object IDs.

shape

Two-element tuple containing the redundant form matrix dimensions.

size

Total number of elements in the underlying data structure.

Methods

read

Create a new DistanceMatrix instance from a file.

to_series

Create a pandas Series from this DistanceMatrix.

write

Write an instance of DistanceMatrix to a file.

Methods (inherited)

as_condensed

Return a condensed form deep copy of the matrix.

as_redundant

Return a redundant form deep copy of the matrix.

between

Obtain the pairwise values between the two groups of IDs.

condensed_form

Return an array of distances in condensed format.

copy

Return a deep copy of the symmetric matrix.

filter

Filter the matrix by IDs.

from_iterable

Create a symmetric matrix from an iterable given a metric.

index

Return the index of the specified ID.

permute

Randomly permute both rows and columns in the matrix.

plot

Create a heatmap of the matrix.

redundant_form

Return an array of values in redundant format.

rename

Rename IDs in the matrix.

to_data_frame

Create a pandas DataFrame from this matrix.

transpose

Return the transpose of the matrix.

within

Obtain all the pairwise values among the set of IDs.

Special methods (inherited)

__contains__

Check if the specified ID is in the matrix.

__eq__

Compare this matrix to another for equality.

__ge__

Return self>=value.

__getitem__

Slice into data by object ID or NumPy indexing.

__getstate__

Helper for pickle.

__gt__

Return self>value.

__le__

Return self<=value.

__lt__

Return self<value.

__ne__

Determine whether two matrices are not equal.

__str__

Return a string representation of the matrix.

Details