skbio.sequence.distance.hamming

skbio.sequence.distance.hamming(seq1, seq2)

Compute Hamming distance between two sequences.

The Hamming distance between two equal-length sequences is the proportion of positions at which the corresponding characters differ.
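The definition above can be sketched in plain Python. This is only an illustration of the arithmetic, not scikit-bio's implementation: `hamming_proportion` is a hypothetical helper that works on ordinary strings and assumes non-empty input.

```python
def hamming_proportion(a, b):
    """Proportion of positions at which two equal-length strings differ.

    Hypothetical helper for illustration only; not part of scikit-bio.
    Assumes both inputs are non-empty.
    """
    if len(a) != len(b):
        raise ValueError("Sequences must be the same length.")
    # Count the positions where the paired characters differ.
    mismatches = sum(x != y for x, y in zip(a, b))
    return mismatches / len(a)


# Positions 1, 3 and 4 differ, so 3 of 6 positions mismatch.
print(hamming_proportion("AGGGTA", "CGTTTA"))  # 0.5
```

Because the result is a proportion rather than a count, multiplying it by the sequence length recovers the number of mismatched positions.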

Parameters:

seq1, seq2 (Sequence): Sequences to compute Hamming distance between.

Returns:

float: Hamming distance between seq1 and seq2.

Raises:

TypeError: If seq1 or seq2 is not a Sequence instance.
TypeError: If seq1 and seq2 are not the same type.
ValueError: If seq1 and seq2 are not the same length.

See also

scipy.spatial.distance.hamming

Notes

np.nan will be returned if the sequences do not contain any characters.

This function does not make assumptions about the sequence alphabet in use. Each sequence object’s underlying sequence of characters is used to compute Hamming distance. Characters that may be considered equivalent in certain contexts (e.g., - and . as gap characters) are treated as distinct characters when computing Hamming distance.
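The two notes above can be illustrated with a small plain-Python stand-in. This is a sketch that mirrors the documented behavior, not the library's code: `hamming_sketch` is a hypothetical name, it operates on plain strings rather than Sequence objects, and it uses `math.nan` where the documented function returns np.nan.

```python
import math


def hamming_sketch(a, b):
    """Illustrative stand-in for the documented behavior (hypothetical)."""
    if len(a) != len(b):
        raise ValueError("Sequences must be the same length.")
    if len(a) == 0:
        # No characters to compare, so the proportion is undefined.
        return math.nan
    # Characters are compared literally; no alphabet rules are applied.
    return sum(x != y for x, y in zip(a, b)) / len(a)


# '-' and '.' are different characters, so they count as one mismatch.
gap_distance = hamming_sketch("A-CG", "A.CG")
print(gap_distance)  # 0.25

# Empty input yields NaN rather than 0.0.
empty_distance = hamming_sketch("", "")
print(math.isnan(empty_distance))  # True
```

If the two gap characters should be treated as equivalent, the sequences need to be normalized to a single gap character before the distance is computed.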

Examples

>>> from skbio import Sequence
>>> from skbio.sequence.distance import hamming
>>> seq1 = Sequence('AGGGTA')
>>> seq2 = Sequence('CGTTTA')
>>> hamming(seq1, seq2)
0.5