skbio.embedding.ProteinEmbedding
- class skbio.embedding.ProteinEmbedding(embedding, sequence, clip_head=False, clip_tail=False, **kwargs)
Embedding of a protein sequence.
- Parameters:
- embedding : array_like
The embedding of the protein sequence. Each row vector holds the latent coordinates of one residue.
- sequence : str, Protein, or 1D ndarray
Characters representing the protein sequence itself.
- clip_head : bool, optional
If True, the first row of the embedding is removed. Some language models prepend a start token, and this parameter accounts for it.
- clip_tail : bool, optional
If True, the last row of the embedding is removed. Some language models append an end token, and this parameter accounts for it.
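The effect of clip_head and clip_tail can be illustrated with plain NumPy slicing. The sketch below is not the library's implementation; clip_embedding is a hypothetical helper, and it assumes a language model that emits exactly one start-token row and one end-token row around the per-residue rows.

```python
import numpy as np


def clip_embedding(embedding, clip_head=False, clip_tail=False):
    """Hypothetical helper: drop special-token rows from an embedding.

    Mirrors the documented meaning of clip_head / clip_tail; it is an
    illustration, not scikit-bio's own code.
    """
    embedding = np.asarray(embedding)
    if clip_head:
        embedding = embedding[1:]   # drop the first row (start token)
    if clip_tail:
        embedding = embedding[:-1]  # drop the last row (end token)
    return embedding


sequence = "ACDEFGHIKL"

# Assumed model output: one row per residue plus a start and an end token row.
rng = np.random.default_rng(0)
raw = rng.random((len(sequence) + 2, 3))

clipped = clip_embedding(raw, clip_head=True, clip_tail=True)

# After clipping, there is one row per residue in the sequence.
assert clipped.shape[0] == len(sequence)
```

With scikit-bio installed, passing the unclipped array together with the flags, as in ProteinEmbedding(raw, sequence, clip_head=True, clip_tail=True), should leave the same per-residue rows, going by the parameter descriptions above.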
Examples
>>> from skbio.embedding import ProteinEmbedding
>>> import numpy as np
>>> embedding = np.random.rand(10, 3)
>>> sequence = "ACDEFGHIKL"
>>> ProteinEmbedding(embedding, sequence)
ProteinEmbedding
--------------------------
Stats:
    length: 10
    embedding dimension: 3
    has gaps: False
    has degenerates: False
    has definites: True
    has stops: False
--------------------------
0 ACDEFGHIKL
Attributes
default_write_format
embedding
The embedding tensor.
ids
IDs corresponding to each row of the embedding.
residues
Array containing underlying residue characters.
sequence
String representation of the underlying sequence.
Built-ins
__eq__(value, /)
Return self==value.
__ge__(value, /)
Return self>=value.
__getstate__(/)
Helper for pickle.
__gt__(value, /)
Return self>value.
__hash__(/)
Return hash(self).
__le__(value, /)
Return self<=value.
__lt__(value, /)
Return self<value.
__ne__(value, /)
Return self!=value.
__str__()
String representation of the underlying sequence.
Methods