skbio.stats.distance.DissimilarityMatrix
class skbio.stats.distance.DissimilarityMatrix(data, ids=None, validate=True)
Store dissimilarities between objects.
A DissimilarityMatrix instance stores a square, hollow, two-dimensional matrix of dissimilarities between objects. Objects could be, for example, samples or DNA sequences. A sequence of IDs accompanies the dissimilarities.
Methods are provided to load and save dissimilarity matrices from/to disk, as well as perform common operations such as extracting dissimilarities based on object ID.
- Parameters:
- data : array_like or DissimilarityMatrix
Square, hollow, two-dimensional numpy.ndarray of dissimilarities (floats), or a structure that can be converted to a numpy.ndarray using numpy.asarray, or a one-dimensional vector of dissimilarities (floats), as defined by scipy.spatial.distance.squareform. Can instead be a DissimilarityMatrix (or subclass) instance, in which case the instance's data will be used. Data will be converted to a float dtype if necessary. A copy will not be made if data is already a numpy.ndarray with a float dtype.
- ids : sequence of str, optional
Sequence of strings to be used as object IDs. Must match the number of rows/cols in data. If None (the default), IDs will be monotonically increasing integers cast as strings, with numbering starting from zero, e.g., ('0', '1', '2', '3', ...).
- validate : bool, optional
If validate is True (the default) and data is not a DissimilarityMatrix object, the input data will be validated.
Notes
The dissimilarities are stored in redundant (square-form) format [1].
The data are not checked for symmetry, nor guaranteed/assumed to be symmetric.
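To make the distinction between redundant and condensed layouts concrete, here is a plain-Python sketch of the expansion that scipy.spatial.distance.squareform performs on a one-dimensional vector. The helper name condensed_to_redundant is hypothetical and is not part of scikit-bio or SciPy. Note that a condensed vector can only describe symmetric data, whereas the redundant form stored by this class can also hold asymmetric dissimilarities.

```python
def condensed_to_redundant(condensed, n):
    """Expand a condensed vector into an n x n redundant (square) matrix.

    The condensed vector lists the upper triangle row by row, excluding
    the diagonal: d(0,1), d(0,2), ..., d(1,2), ...
    """
    expected = n * (n - 1) // 2
    if len(condensed) != expected:
        raise ValueError(f"expected {expected} values, got {len(condensed)}")
    square = [[0.0] * n for _ in range(n)]
    k = 0
    for i in range(n):
        for j in range(i + 1, n):
            # Mirror each value across the (all-zero) diagonal.
            square[i][j] = square[j][i] = float(condensed[k])
            k += 1
    return square


# Three objects need 3 * 2 / 2 = 3 pairwise values.
square = condensed_to_redundant([0.5, 1.0, 0.75], 3)
```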
References
Attributes
T
Transpose of the dissimilarity matrix.
data
Array of dissimilarities.
default_write_format
dtype
Data type of the dissimilarities.
ids
Tuple of object IDs.
png
Get figure data in PNG format.
shape
Two-element tuple containing the dissimilarity matrix dimensions.
size
Total number of elements in the dissimilarity matrix.
svg
Get figure data in SVG format.
Built-ins
__contains__(lookup_id)
Check if the specified ID is in the dissimilarity matrix.
__eq__(other)
Compare this dissimilarity matrix to another for equality.
__ge__(value, /)
Return self>=value.
__getitem__(index)
Slice into dissimilarity data by object ID or numpy indexing.
__getstate__(/)
Helper for pickle.
__gt__(value, /)
Return self>value.
__le__(value, /)
Return self<=value.
__lt__(value, /)
Return self<value.
__ne__(other)
Determine whether two dissimilarity matrices are not equal.
__str__()
Return a string representation of the dissimilarity matrix.
Methods
between(from_, to_[, allow_overlap])
Obtain the distances between the two groups of IDs.
copy()
Return a deep copy of the dissimilarity matrix.
filter(ids[, strict])
Filter the dissimilarity matrix by IDs.
from_iterable(iterable, metric[, key, keys])
Create DissimilarityMatrix from an iterable given a metric.
index(lookup_id)
Return the index of the specified ID.
plot([cmap, title])
Create a heatmap of the dissimilarity matrix.
read(file[, format])
Create a new DissimilarityMatrix instance from a file.
redundant_form()
Return an array of dissimilarities in redundant format.
rename(mapper[, strict])
Rename IDs in the dissimilarity matrix.
to_data_frame()
Create a pandas.DataFrame from this DissimilarityMatrix.
transpose()
Return the transpose of the dissimilarity matrix.
within(ids)
Obtain all the distances among the set of IDs.
write(file[, format])
Write an instance of DissimilarityMatrix to a file.