skbio.stats.distance.DistanceMatrix
- class skbio.stats.distance.DistanceMatrix(data, ids=None, validate=True)
Store distances between objects.
A DistanceMatrix is a DissimilarityMatrix with the additional requirement that the matrix data is symmetric. There are additional methods made available that take advantage of this symmetry. The plot method provides convenient built-in plotting functionality.
- Parameters:
- data : array_like or DissimilarityMatrix
Square, hollow, two-dimensional numpy.ndarray of distances (floats), or a structure that can be converted to a numpy.ndarray using numpy.asarray, or a one-dimensional vector of distances (floats), as defined by scipy.spatial.distance.squareform. Can instead be a DissimilarityMatrix (or DistanceMatrix) instance, in which case the instance’s data will be used. Data will be converted to a float dtype if necessary. A copy will not be made if already a numpy.ndarray with a float dtype.
- ids : sequence of str, optional
Sequence of strings to be used as object IDs. Must match the number of rows/cols in data. If None (the default), IDs will be monotonically-increasing integers cast as strings, with numbering starting from zero, e.g., ('0', '1', '2', '3', ...).
- validate : bool, optional
If validate is True (the default) and data is not a DistanceMatrix object, the input data will be validated.
Notes
The distances are stored in redundant (square-form) format [1]. To facilitate use with other scientific Python routines (e.g., scipy), the distances can be retrieved in condensed (vector-form) format using condensed_form.
DistanceMatrix only requires that the distances it stores are symmetric. Checks are not performed to ensure the other three metric properties hold (non-negativity, identity of indiscernibles, and triangle inequality) [2]. Thus, a DistanceMatrix instance can store distances that are not metric.
References
Attributes (inherited)
T: Transpose of the dissimilarity matrix.
data: Array of dissimilarities.
dtype: Data type of the dissimilarities.
ids: Tuple of object IDs.
shape: Two-element tuple containing the dissimilarity matrix dimensions.
size: Total number of elements in the dissimilarity matrix.
Methods
condensed_form(): Return an array of distances in condensed format.
from_iterable(iterable, metric[, key, keys, ...]): Create DistanceMatrix from all pairs in an iterable given a metric.
permute([condensed, seed]): Randomly permute both rows and columns in the matrix.
to_series(): Create a pandas.Series from this DistanceMatrix.
Methods (inherited)
between(from_, to_[, allow_overlap]): Obtain the distances between the two groups of IDs.
copy(): Return a deep copy of the dissimilarity matrix.
filter(ids[, strict]): Filter the dissimilarity matrix by IDs.
index(lookup_id): Return the index of the specified ID.
plot([cmap, title]): Create a heatmap of the dissimilarity matrix.
read([format]): Create a new DistanceMatrix instance from a file.
redundant_form(): Return an array of dissimilarities in redundant format.
rename(mapper[, strict]): Rename IDs in the dissimilarity matrix.
to_data_frame(): Create a pandas.DataFrame from this DissimilarityMatrix.
transpose(): Return the transpose of the dissimilarity matrix.
within(ids): Obtain all the distances among the set of IDs.
write(file[, format]): Write an instance of DistanceMatrix to a file.
Special methods (inherited)
__contains__(lookup_id): Check if the specified ID is in the dissimilarity matrix.
__eq__(other): Compare this dissimilarity matrix to another for equality.
__ge__(value, /): Return self>=value.
__getitem__(index): Slice into dissimilarity data by object ID or numpy indexing.
__getstate__(/): Helper for pickle.
__gt__(value, /): Return self>value.
__le__(value, /): Return self<=value.
__lt__(value, /): Return self<value.
__ne__(other): Determine whether two dissimilarity matrices are not equal.
__str__(): Return a string representation of the dissimilarity matrix.
Details