skbio.stats.distance.DistanceMatrix
- class skbio.stats.distance.DistanceMatrix(data, ids=None, validate=True)
Store distances between objects.
A DistanceMatrix is a DissimilarityMatrix with the additional requirement that the matrix data is symmetric. There are additional methods made available that take advantage of this symmetry.
- Parameters:
- data : array_like or DissimilarityMatrix
Square, hollow, two-dimensional numpy.ndarray of distances (floats), or a structure that can be converted to a numpy.ndarray using numpy.asarray, or a one-dimensional vector of distances (floats), as defined by scipy.spatial.distance.squareform. Can instead be a DissimilarityMatrix (or DistanceMatrix) instance, in which case the instance’s data will be used. Data will be converted to a float dtype if necessary. A copy will not be made if the data is already a numpy.ndarray with a float dtype.
- ids : sequence of str, optional
Sequence of strings to be used as object IDs. Must match the number of rows/cols in data. If None (the default), IDs will be monotonically increasing integers cast as strings, with numbering starting from zero, e.g., ('0', '1', '2', '3', ...).
- validate : bool, optional
If validate is True (the default) and data is not a DistanceMatrix object, the input data will be validated.
Notes
The distances are stored in redundant (square-form) format [1]. To facilitate use with other scientific Python routines (e.g., scipy), the distances can be retrieved in condensed (vector-form) format using condensed_form.
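To make the two storage formats concrete, here is a small sketch that uses scipy.spatial.distance.squareform directly rather than DistanceMatrix. It assumes that condensed_form returns the same vector layout as squareform; the matrix values are illustrative.

```python
import numpy as np
from scipy.spatial.distance import squareform

# Redundant (square-form) layout: every distance appears twice,
# once above and once below the zero diagonal.
redundant = np.array([[0.0, 0.5, 1.0],
                      [0.5, 0.0, 0.75],
                      [1.0, 0.75, 0.0]])

# Condensed (vector-form) layout: only the upper triangle, row by row.
condensed = squareform(redundant)

# squareform is its own inverse: a vector is expanded back to a square matrix.
restored = squareform(condensed)

print(condensed)  # n * (n - 1) / 2 values for an n x n matrix
print(restored)
```

The condensed vector is the form that routines such as scipy's hierarchical clustering functions accept.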
DistanceMatrix only requires that the distances it stores are symmetric. Checks are not performed to ensure the other three metric properties hold (non-negativity, identity of indiscernibles, and triangle inequality) [2]. Thus, a DistanceMatrix instance can store distances that are not metric.
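As a sketch of what "symmetric but not metric" means, the following NumPy-only check builds a symmetric, hollow matrix that violates the triangle inequality. The values are illustrative, and the check is written out by hand here; it is not something DistanceMatrix performs.

```python
import numpy as np

# Symmetric and hollow, but d(0, 2) = 5 exceeds d(0, 1) + d(1, 2) = 2.
d = np.array([[0.0, 1.0, 5.0],
              [1.0, 0.0, 1.0],
              [5.0, 1.0, 0.0]])

# Symmetry: the only property the text above says is required.
is_symmetric = np.array_equal(d, d.T)

# Triangle inequality: d[i, k] <= d[i, j] + d[j, k] for all i, j, k.
# via[i, j, k] holds the length of the detour from i to k through j.
via = d[:, :, None] + d[None, :, :]
satisfies_triangle = bool(np.all(d[:, None, :] <= via))

print(is_symmetric)
print(satisfies_triangle)
```

A matrix like this passes a symmetry check while failing the triangle inequality, so methods that assume a true metric should be applied with care.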
References
Attributes (inherited)
Transpose of the dissimilarity matrix.
Array of dissimilarities.
Data type of the dissimilarities.
Tuple of object IDs.
Get figure data in PNG format.
Two-element tuple containing the dissimilarity matrix dimensions.
Total number of elements in the dissimilarity matrix.
Get figure data in SVG format.
Methods
Return an array of distances in condensed format.
from_iterable(iterable, metric[, key, keys, ...]): Create DistanceMatrix from all pairs in an iterable given a metric.
permute([condensed, seed]): Randomly permute both rows and columns in the matrix.
Create a pandas.Series from this DistanceMatrix.
Methods (inherited)
between(from_, to_[, allow_overlap]): Obtain the distances between the two groups of IDs.
copy(): Return a deep copy of the dissimilarity matrix.
filter(ids[, strict]): Filter the dissimilarity matrix by IDs.
index(lookup_id): Return the index of the specified ID.
plot([cmap, title]): Create a heatmap of the dissimilarity matrix.
read([format]): Create a new DistanceMatrix instance from a file.
Return an array of dissimilarities in redundant format.
rename(mapper[, strict]): Rename IDs in the dissimilarity matrix.
Create a pandas.DataFrame from this DissimilarityMatrix.
Return the transpose of the dissimilarity matrix.
within(ids): Obtain all the distances among the set of IDs.
write(file[, format]): Write an instance of DistanceMatrix to a file.
Special methods (inherited)
__contains__(lookup_id): Check if the specified ID is in the dissimilarity matrix.
__eq__(other): Compare this dissimilarity matrix to another for equality.
__ge__(value, /): Return self>=value.
__getitem__(index): Slice into dissimilarity data by object ID or numpy indexing.
__getstate__(/): Helper for pickle.
__gt__(value, /): Return self>value.
__le__(value, /): Return self<=value.
__lt__(value, /): Return self<value.
__ne__(other): Determine whether two dissimilarity matrices are not equal.
__str__(): Return a string representation of the dissimilarity matrix.
Details