skbio.sequence.distance.kmer_distance
- skbio.sequence.distance.kmer_distance(seq1, seq2, k, overlap=True)
 Compute the kmer distance between a pair of sequences.
The kmer distance between two sequences is the fraction of the distinct kmers found in either sequence that occur in only one of the two sequences.
- Parameters:
 - seq1, seq2 : Sequence
 Sequences to compute kmer distance between.
 - k : int
 The kmer length.
 - overlap : bool, optional
 Defines whether the kmers should be overlapping or not.
- Returns:
 - float
 kmer distance between seq1 and seq2.
- Raises:
 - ValueError
 If k is less than 1.
 - TypeError
 If seq1 and seq2 are not Sequence instances.
 - TypeError
 If seq1 and seq2 are not the same type.
Notes
kmer counts are not incorporated in this distance metric.
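Read together with the definition above, this suggests a set-based formulation. As a sketch, writing K_1 and K_2 (notation introduced here, not taken from the library documentation) for the sets of distinct kmers of seq1 and seq2, the distance can be expressed as a Jaccard distance:

```latex
d(\mathrm{seq1}, \mathrm{seq2})
  = \frac{|K_1 \cup K_2| - |K_1 \cap K_2|}{|K_1 \cup K_2|}
  = 1 - \frac{|K_1 \cap K_2|}{|K_1 \cup K_2|}
```

Because K_1 and K_2 are sets, a kmer that occurs many times in a sequence contributes exactly as much as one that occurs once.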
np.nan will be returned if there are no kmers defined for the sequences.
Examples
>>> from skbio import Sequence
>>> from skbio.sequence.distance import kmer_distance
>>> seq1 = Sequence('ATCGGCGAT')
>>> seq2 = Sequence('GCAGATGTG')
>>> kmer_distance(seq1, seq2, 3)
0.9230769230...
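To make the definition concrete, here is a minimal pure-Python sketch of the same calculation on plain strings. It is not the scikit-bio implementation: the helper names `kmers` and `kmer_distance_sketch` are hypothetical, and the non-overlapping case assumes consecutive windows starting at position 0 with any trailing partial window discarded.

```python
import math


def kmers(seq, k, overlap=True):
    """Return the set of distinct length-k substrings of a plain string.

    Hypothetical helper for illustration only.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    # Assumption: non-overlapping windows advance by k from position 0.
    step = 1 if overlap else k
    return {seq[i:i + k] for i in range(0, len(seq) - k + 1, step)}


def kmer_distance_sketch(seq1, seq2, k, overlap=True):
    """Fraction of distinct kmers that occur in only one of the sequences."""
    a = kmers(seq1, k, overlap)
    b = kmers(seq2, k, overlap)
    union = a | b
    if not union:
        # No kmers are defined, e.g. k is longer than both sequences.
        return math.nan
    return (len(union) - len(a & b)) / len(union)


# The two sequences share one 3-mer ('GAT') out of 13 distinct 3-mers.
print(round(kmer_distance_sketch("ATCGGCGAT", "GCAGATGTG", 3), 4))   # 0.9231
# Non-overlapping 3-mers: one shared kmer out of 5 distinct ones.
print(kmer_distance_sketch("ATCGGCGAT", "GCAGATGTG", 3, overlap=False))  # 0.8
# k exceeds the sequence length, so no kmers exist.
print(kmer_distance_sketch("ATCGGCGAT", "GCAGATGTG", 10))  # nan
```

The sketch works on sets, so repeated kmers within a sequence do not change the result, which matches the note that kmer counts are not incorporated.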