Sequence distance metrics (skbio.sequence.distance)

This module provides functions for computing distances between biological sequences. These functions can be used directly on two Sequence objects, or supplied to other parts of the scikit-bio API that accept a sequence distance metric as input, such as align_dists, Sequence.distance, and PairwiseMatrix.from_iterable.

Generic distance metrics

hamming(seq1, seq2[, proportion])

Compute the Hamming distance between two sequences.

pdist(seq1, seq2)

Calculate the p-distance between two aligned sequences.

logdet(seq1, seq2[, pseudocount])

Calculate the LogDet distance between two aligned sequences.

paralin(seq1, seq2[, pseudocount])

Calculate the paralinear distance between two aligned sequences.

kmer_distance(seq1, seq2, k[, overlap])

Compute the k-mer distance between a pair of sequences.
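
To make the definitions behind two of the generic metrics concrete, here is a minimal pure-Python sketch. It does not call scikit-bio and is not its implementation. The helper names `hamming_sketch`, `kmer_set`, and `kmer_distance_sketch` are hypothetical. The sketch assumes plain strings rather than Sequence objects, and assumes the k-mer distance is the fraction of k-mers in the combined set that are unique to one sequence; the library's exact handling of gaps, degenerate characters, and edge cases may differ.

```python
def hamming_sketch(seq1: str, seq2: str, proportion: bool = True) -> float:
    """Count mismatched positions between two equal-length strings.

    With proportion=True the count is divided by the sequence length.
    """
    if len(seq1) != len(seq2):
        raise ValueError("Sequences must have equal length.")
    if not seq1:
        raise ValueError("Sequences must not be empty.")
    mismatches = sum(a != b for a, b in zip(seq1, seq2))
    return mismatches / len(seq1) if proportion else float(mismatches)


def kmer_set(seq: str, k: int, overlap: bool = True) -> set:
    """Collect the distinct k-mers in seq, overlapping or tiled end to end."""
    step = 1 if overlap else k
    return {seq[i:i + k] for i in range(0, len(seq) - k + 1, step)}


def kmer_distance_sketch(seq1: str, seq2: str, k: int,
                         overlap: bool = True) -> float:
    """Fraction of the combined k-mer set found in only one sequence.

    Assumption: this set-based definition illustrates the idea only.
    """
    kmers1 = kmer_set(seq1, k, overlap)
    kmers2 = kmer_set(seq2, k, overlap)
    union = kmers1 | kmers2
    if not union:
        raise ValueError("Both sequences are shorter than k.")
    return len(kmers1 ^ kmers2) / len(union)
```

Unlike the Hamming sketch, the k-mer sketch places no equal-length requirement on its inputs, because it compares sets of substrings rather than positions.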

Nucleotide distance metrics

jc69(seq1, seq2)

Calculate the JC69 distance between two aligned nucleotide sequences.

f81(seq1, seq2[, freqs])

Calculate the F81 distance between two aligned nucleotide sequences.

k2p(seq1, seq2)

Calculate the K2P distance between two aligned nucleotide sequences.

f84(seq1, seq2[, freqs])

Calculate the F84 distance between two aligned nucleotide sequences.

tn93(seq1, seq2[, freqs])

Calculate the TN93 distance between two aligned nucleotide sequences.
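
As an illustration of how model-based nucleotide distances correct the raw proportion of differing sites, the following pure-Python sketch applies the standard textbook formulas for JC69 and K2P. It does not call scikit-bio, and the names `jc69_sketch` and `k2p_sketch` are hypothetical. It assumes two aligned, non-empty, equal-length strings containing only A, C, G, and T, with no gaps or degenerate characters, and it returns NaN when the correction is undefined (the sequences are too divergent); the library functions may treat these cases differently.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def jc69_sketch(seq1: str, seq2: str) -> float:
    """JC69 distance: d = -3/4 * ln(1 - 4p/3), p = proportion of differing sites."""
    if len(seq1) != len(seq2) or not seq1:
        raise ValueError("Sequences must be non-empty and of equal length.")
    p = sum(a != b for a, b in zip(seq1, seq2)) / len(seq1)
    arg = 1.0 - 4.0 * p / 3.0
    if arg <= 0.0:
        return math.nan  # correction undefined at saturation
    return -0.75 * math.log(arg)


def k2p_sketch(seq1: str, seq2: str) -> float:
    """K2P distance: d = -1/2 * ln(1 - 2P - Q) - 1/4 * ln(1 - 2Q).

    P is the proportion of transitions (A<->G, C<->T) and Q is the
    proportion of transversions (all other mismatches).
    """
    if len(seq1) != len(seq2) or not seq1:
        raise ValueError("Sequences must be non-empty and of equal length.")
    transitions = transversions = 0
    for a, b in zip(seq1, seq2):
        if a == b:
            continue
        same_class = ({a, b} <= PURINES) or ({a, b} <= PYRIMIDINES)
        if same_class:
            transitions += 1
        else:
            transversions += 1
    n = len(seq1)
    P, Q = transitions / n, transversions / n
    arg1, arg2 = 1.0 - 2.0 * P - Q, 1.0 - 2.0 * Q
    if arg1 <= 0.0 or arg2 <= 0.0:
        return math.nan  # correction undefined at saturation
    return -0.5 * math.log(arg1) - 0.25 * math.log(arg2)
```

Both corrections reduce to zero for identical sequences and grow faster than the raw proportion of differences as divergence increases, which is how they account for multiple substitutions at the same site.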