skbio.sequence.Protein.find_motifs
- Protein.find_motifs(motif_type, min_length=1, ignore=None)
Search the biological sequence for motifs.
Options for motif_type:
- ‘N-glycosylation’
Identify N-glycosylation runs.
- Parameters:
- motif_type : str
Type of motif to find.
- min_length : int, optional
Only motifs at least as long as min_length will be returned.
- ignore : 1D array_like (bool), optional
Boolean vector indicating positions to ignore when matching.
- Yields:
- slice
Location of the motif in the biological sequence.
- Raises:
- ValueError
If an unknown motif_type is specified.
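Examples
The examples below use DNA with its 'purine-run' motif type; Protein is called the same way with 'N-glycosylation'.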
>>> from skbio import DNA
>>> s = DNA('ACGGGGAGGCGGAG')
>>> for motif_slice in s.find_motifs('purine-run', min_length=2):
...     motif_slice
...     str(s[motif_slice])
slice(2, 9, None)
'GGGGAGG'
slice(10, 14, None)
'GGAG'
Gap characters can disrupt motifs:
>>> s = DNA('GG-GG')
>>> for motif_slice in s.find_motifs('purine-run'):
...     motif_slice
slice(0, 2, None)
slice(3, 5, None)
Gaps can be ignored by passing the gap boolean vector to ignore:
>>> s = DNA('GG-GG')
>>> for motif_slice in s.find_motifs('purine-run', ignore=s.gaps()):
...     motif_slice
slice(0, 5, None)
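To see why the ignore vector turns two slices into one, the following standalone sketch may help. It is not scikit-bio's implementation: find_runs is a hypothetical helper, and the regular expression '[AG]+' is only an assumed stand-in for a purine run. The idea it illustrates is that ignored positions are dropped before matching, and each match is then mapped back to coordinates in the original sequence.

```python
import re


def find_runs(seq, pattern, min_length=1, ignore=None):
    """Yield slices of `seq` matched by `pattern`, skipping ignored positions.

    Hypothetical helper for illustration only; not part of scikit-bio.
    """
    if ignore is None:
        ignore = [False] * len(seq)
    if len(ignore) != len(seq):
        raise ValueError("ignore must have one entry per sequence position")

    # Original indices of the positions that take part in matching.
    kept = [i for i, skip in enumerate(ignore) if not skip]
    reduced = "".join(seq[i] for i in kept)

    for match in re.finditer(pattern, reduced):
        length = match.end() - match.start()
        if length == 0 or length < min_length:
            continue
        # Map the match back to coordinates in the original sequence.
        yield slice(kept[match.start()], kept[match.end() - 1] + 1)


seq = "GG-GG"
gap_mask = [char == "-" for char in seq]  # plays the role of s.gaps()

print(list(find_runs(seq, "[AG]+")))                   # gap splits the run
print(list(find_runs(seq, "[AG]+", ignore=gap_mask)))  # gap is skipped over
```

In this sketch the length of a match is counted on the reduced sequence, so ignored positions widen the returned slice without counting toward min_length.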