skbio.stats.distance.DistanceMatrix
- class skbio.stats.distance.DistanceMatrix(data, ids=None, validate=True)
Store distances between objects.
A DistanceMatrix is a DissimilarityMatrix with the additional requirement that the matrix data is symmetric. There are additional methods made available that take advantage of this symmetry.
- Parameters:
- data : array_like or DissimilarityMatrix
Square, hollow, two-dimensional numpy.ndarray of distances (floats), or a structure that can be converted to a numpy.ndarray using numpy.asarray, or a one-dimensional vector of distances (floats), as defined by scipy.spatial.distance.squareform. Can instead be a DissimilarityMatrix (or DistanceMatrix) instance, in which case the instance's data will be used. Data will be converted to a float dtype if necessary. A copy will not be made if data is already a numpy.ndarray with a float dtype.
- ids : sequence of str, optional
Sequence of strings to be used as object IDs. Must match the number of rows/cols in data. If None (the default), IDs will be monotonically-increasing integers cast as strings, with numbering starting from zero, e.g., ('0', '1', '2', '3', ...).
- validate : bool, optional
If validate is True (the default) and data is not a DistanceMatrix object, the input data will be validated.
See also
Notes
The distances are stored in redundant (square-form) format [1]. To facilitate use with other scientific Python routines (e.g., scipy), the distances can be retrieved in condensed (vector-form) format using condensed_form.
DistanceMatrix only requires that the distances it stores are symmetric. Checks are not performed to ensure the other three metric properties hold (non-negativity, identity of indiscernibles, and triangle inequality) [2]. Thus, a DistanceMatrix instance can store distances that are not metric.
References
Attributes (inherited)
Transpose of the dissimilarity matrix.
Array of dissimilarities.
Data type of the dissimilarities.
Tuple of object IDs.
Two-element tuple containing the dissimilarity matrix dimensions.
Total number of elements in the dissimilarity matrix.
Methods
Return an array of distances in condensed format.
from_iterable(iterable, metric[, key, keys, ...])
Create DistanceMatrix from all pairs in an iterable given a metric.
permute([condensed, seed])
Randomly permute both rows and columns in the matrix.
Create a pandas.Series from this DistanceMatrix.
Methods (inherited)
between(from_, to_[, allow_overlap])
Obtain the distances between the two groups of IDs.
copy()
Return a deep copy of the dissimilarity matrix.
filter(ids[, strict])
Filter the dissimilarity matrix by IDs.
index(lookup_id)
Return the index of the specified ID.
plot([cmap, title])
Create a heatmap of the dissimilarity matrix.
read([format])
Create a new DistanceMatrix instance from a file.
Return an array of dissimilarities in redundant format.
rename(mapper[, strict])
Rename IDs in the dissimilarity matrix.
Create a pandas.DataFrame from this DissimilarityMatrix.
Return the transpose of the dissimilarity matrix.
within(ids)
Obtain all the distances among the set of IDs.
write(file[, format])
Write an instance of DistanceMatrix to a file.
Special methods (inherited)
__contains__(lookup_id)
Check if the specified ID is in the dissimilarity matrix.
__eq__(other)
Compare this dissimilarity matrix to another for equality.
__ge__(value, /)
Return self>=value.
__getitem__(index)
Slice into dissimilarity data by object ID or numpy indexing.
__getstate__(/)
Helper for pickle.
__gt__(value, /)
Return self>value.
__le__(value, /)
Return self<=value.
__lt__(value, /)
Return self<value.
__ne__(other)
Determine whether two dissimilarity matrices are not equal.
__str__()
Return a string representation of the dissimilarity matrix.
Details