skbio.embedding.ProteinEmbedding
- class skbio.embedding.ProteinEmbedding(embedding, sequence, clip_head=False, clip_tail=False, **kwargs)
Embedding of a protein sequence.
- Parameters:
- embedding : array_like
The embedding of the protein sequence. Each row vector holds the latent coordinates of one residue.
- sequence : str, Protein, or 1D ndarray
Characters representing the protein sequence itself.
- clip_head : bool, optional
If True, the first row of the embedding is removed. Some language models emit a start token, and this parameter can be used to account for it.
- clip_tail : bool, optional
If True, the last row of the embedding is removed. Some language models emit an end token, and this parameter can be used to account for it.
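The effect of clip_head and clip_tail can be sketched with plain NumPy slicing. This is only an illustration of the row removal described above, not the library's implementation: the helper `clip_rows` and the array `raw` are hypothetical names, and the sketch assumes a model output with one extra row for a start token and one for an end token.

```python
import numpy as np

sequence = "ACDEFGHIKL"

# Hypothetical model output: one row per residue, plus one start-token row
# and one end-token row, so 12 rows for a 10-residue sequence.
raw = np.arange((len(sequence) + 2) * 3, dtype=float).reshape(len(sequence) + 2, 3)


def clip_rows(embedding, clip_head=False, clip_tail=False):
    """Hypothetical helper mimicking the documented clip_head / clip_tail behavior."""
    start = 1 if clip_head else 0
    stop = embedding.shape[0] - 1 if clip_tail else embedding.shape[0]
    return embedding[start:stop]


clipped = clip_rows(raw, clip_head=True, clip_tail=True)

# After clipping, the number of rows matches the number of residues.
print(clipped.shape[0] == len(sequence))
```

Going by the signature above, the corresponding call on the class itself would presumably be ProteinEmbedding(raw, sequence, clip_head=True, clip_tail=True), which leaves one embedding row per residue.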
See also
Examples
>>> from skbio.embedding import ProteinEmbedding
>>> import numpy as np
>>> embedding = np.random.rand(10, 3)
>>> sequence = "ACDEFGHIKL"
>>> ProteinEmbedding(embedding, sequence)
ProteinEmbedding
--------------------------
Stats:
    length: 10
    embedding dimension: 3
    has gaps: False
    has degenerates: False
    has definites: True
    has stops: False
--------------------------
0 ACDEFGHIKL
Attributes
- residues
Array containing underlying residue characters.
Attributes (inherited)
The embedding tensor.
IDs corresponding to each row of the embedding.
String representation of the underlying sequence.
Methods
- read(file[, format])
Create a new ProteinEmbedding instance from a file.
- write(file[, format])
Write an instance of ProteinEmbedding to a file.
Methods (inherited)
- bytes()
Bytes representation of string encoding.
Special methods (inherited)
- __eq__(value, /)
Return self==value.
- __ge__(value, /)
Return self>=value.
- __getstate__(/)
Helper for pickle.
- __gt__(value, /)
Return self>value.
- __hash__(/)
Return hash(self).
- __le__(value, /)
Return self<=value.
- __lt__(value, /)
Return self<value.
- __ne__(value, /)
Return self!=value.
- __str__()
String representation of the underlying sequence.
Details
- default_write_format = 'embed'
- residues
Array containing underlying residue characters.