skbio.sequence.Sequence.kmer_frequencies

Sequence.kmer_frequencies(k, overlap=True, relative=False)

Return counts of words of length k from this sequence.

Parameters:
k : int

The word length.

overlap : bool, optional

Defines whether the kmers should be overlapping or not. Defaults to True.

relative : bool, optional

If True, return the relative frequency of each kmer instead of its count. Defaults to False.

Returns:
dict

Frequencies of words of length k contained in this sequence, keyed by kmer. Values are counts, or relative frequencies if relative is True.

Raises:
ValueError

If k is less than 1.

Examples

>>> from skbio import Sequence
>>> s = Sequence('ACACATTTATTA')
>>> freqs = s.kmer_frequencies(3, overlap=False)
>>> freqs
{'ACA': 1, 'CAT': 1, 'TTA': 2}
>>> freqs = s.kmer_frequencies(3, relative=True, overlap=False)
>>> freqs
{'ACA': 0.25, 'CAT': 0.25, 'TTA': 0.5}
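
To make the effect of overlap and relative concrete, the following is a minimal pure-Python sketch of the counting behaviour described above. It is not scikit-bio's implementation: the function name kmer_frequencies_sketch is hypothetical, it works on a plain str rather than a Sequence, and it assumes that a trailing fragment shorter than k is dropped and that overlapping windows advance one position at a time.

```python
from collections import Counter


def kmer_frequencies_sketch(seq, k, overlap=True, relative=False):
    """Hypothetical stand-in that mirrors the documented parameters.

    Not scikit-bio's code; operates on a plain string.
    """
    if k < 1:
        raise ValueError("k must be at least 1")

    # Assumption: overlapping windows advance by 1, non-overlapping by k.
    step = 1 if overlap else k

    # Assumption: only full-length windows are counted, so a trailing
    # fragment shorter than k is ignored.
    counts = Counter(seq[i:i + k] for i in range(0, len(seq) - k + 1, step))

    if not relative:
        return dict(counts)

    # Divide each count by the total number of kmers observed.
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()}


non_overlapping = kmer_frequencies_sketch('ACACATTTATTA', 3, overlap=False)
overlapping = kmer_frequencies_sketch('ACACATTTATTA', 3)
print(non_overlapping)
print(overlapping)
```

With overlap=False the sketch reproduces the counts shown in the example above for the same input. With the default overlap=True, the 12-character sequence yields 10 windows instead of 4, so kmers such as 'CAC' and 'TTT' that straddle the non-overlapping boundaries also appear.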