skbio.sequence.distance.k2p

skbio.sequence.distance.k2p(seq1, seq2)

Calculate the K2P distance between two aligned nucleotide sequences.

Added in version 0.7.2.

The Kimura 2-parameter (K2P, a.k.a. K80) model allows differential rates of transitions (substitutions between two purines or between two pyrimidines) versus transversions (substitutions between a purine and a pyrimidine), while assuming equal base frequencies. The distance is calculated as:

\[D = -\frac{1}{2} \ln\left((1 - 2P - Q) \sqrt{1 - 2Q}\right)\]

Where \(P\) and \(Q\) are the proportions of transitions and transversions, respectively.
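The formula above can be sketched in plain Python. This is an illustrative sketch only, not scikit-bio's implementation: the name `k2p_sketch`, the use of plain strings instead of `DNA`/`RNA` objects, and the choice to skip positions containing gaps or ambiguous characters are all assumptions made for the example.

```python
import math

PURINES = frozenset("AG")
PYRIMIDINES = frozenset("CT")
BASES = PURINES | PYRIMIDINES


def k2p_sketch(seq1: str, seq2: str) -> float:
    """Illustrative K2P distance between two aligned nucleotide strings.

    Hypothetical helper, not part of scikit-bio. Positions where either
    sequence has a character other than A, C, G, T/U are skipped, which
    is an assumption of this sketch.
    """
    if len(seq1) != len(seq2):
        raise ValueError("Sequences must be aligned to the same length.")

    # Treat RNA and DNA alike by mapping U to T.
    s1 = seq1.upper().replace("U", "T")
    s2 = seq2.upper().replace("U", "T")

    sites = transitions = transversions = 0
    for a, b in zip(s1, s2):
        if a not in BASES or b not in BASES:
            continue  # gap or ambiguous character: skip (assumption)
        sites += 1
        if a == b:
            continue
        # Same chemical class (purine/purine or pyrimidine/pyrimidine)
        # means a transition; otherwise it is a transversion.
        if (a in PURINES) == (b in PURINES):
            transitions += 1
        else:
            transversions += 1

    if sites == 0:
        return math.nan

    p = transitions / sites    # P in the formula
    q = transversions / sites  # Q in the formula
    term1 = 1.0 - 2.0 * p - q
    term2 = 1.0 - 2.0 * q
    if term1 <= 0.0 or term2 <= 0.0:
        return math.nan  # logarithm undefined
    return -0.5 * math.log(term1 * math.sqrt(term2))


# One transition (A->G) and one transversion (A->C) over 10 sites.
distance = k2p_sketch("AAAAAAAAAA", "GAAAAAAAAC")
```

Here \(P = Q = 0.1\), so the distance is \(-\frac{1}{2} \ln(0.7 \sqrt{0.8})\), slightly larger than the raw proportion of differing sites (0.2) because the model corrects for multiple substitutions at the same site.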

Parameters:
seq1, seq2 : {DNA, RNA}

Sequences to compute the K2P distance between.

Returns:
float

K2P distance between the two sequences.

See also

jc69
f84

Notes

The Kimura 2-parameter model (K2P or K80) was originally described in [1].

K2P extends the JC69 model by allowing transitions and transversions to occur at different rates. Conversely, K2P is a special case of the F84 model in which base frequencies are assumed to be equal.
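The relationship to JC69 can be checked with a short derivation. The symbol \(p\) (the total proportion of differing sites) is introduced here for illustration, and the JC69 distance is taken in its standard form \(-\frac{3}{4} \ln(1 - \frac{4}{3} p)\). When all substitutions are equally likely, one of the three possible changes at a site is a transition and two are transversions, so \(P = p/3\) and \(Q = 2p/3\) in expectation. Substituting into the K2P formula:

\[\begin{aligned}
1 - 2P - Q &= 1 - \tfrac{4}{3} p, \qquad 1 - 2Q = 1 - \tfrac{4}{3} p \\
D &= -\frac{1}{2} \ln\left(\left(1 - \tfrac{4}{3} p\right) \sqrt{1 - \tfrac{4}{3} p}\right)
   = -\frac{1}{2} \cdot \frac{3}{2} \ln\left(1 - \tfrac{4}{3} p\right)
   = -\frac{3}{4} \ln\left(1 - \tfrac{4}{3} p\right)
\end{aligned}\]

which matches the JC69 distance.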

This function returns NaN if either \(1 - 2P - Q\) or \(1 - 2Q\) is zero or negative, which indicates that substitutions are saturated and the logarithm is undefined.
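The saturation condition can be illustrated directly from the two proportions. The helper `k2p_from_proportions` below is hypothetical and written only for this example; it mirrors the rule described above rather than reproducing scikit-bio's code.

```python
import math


def k2p_from_proportions(p: float, q: float) -> float:
    """Hypothetical helper: K2P distance from transition (p) and
    transversion (q) proportions, returning NaN when undefined."""
    term1 = 1.0 - 2.0 * p - q
    term2 = 1.0 - 2.0 * q
    if term1 <= 0.0 or term2 <= 0.0:
        return math.nan
    return -0.5 * math.log(term1 * math.sqrt(term2))


# Both terms positive: the distance is defined.
well_defined = k2p_from_proportions(0.10, 0.05)

# Half of all sites are transversions, so 1 - 2Q = 0.
saturated_transversions = k2p_from_proportions(0.10, 0.50)

# Many transitions plus some transversions, so 1 - 2P - Q = -0.05.
saturated_combined = k2p_from_proportions(0.40, 0.25)
```

Callers that build a distance matrix may want to test results with `math.isnan` (or `numpy.isnan`) before passing them to downstream methods, since NaN values propagate silently through most arithmetic.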

References

[1]

Kimura, M. (1980). A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences. Journal of Molecular Evolution, 16(2), 111-120.