skbio.sequence.distance.k2p
- skbio.sequence.distance.k2p(seq1, seq2, gamma=None)
Calculate the K2P distance between two aligned nucleotide sequences.
Added in version 0.7.2.
The Kimura 2-parameter (K2P, a.k.a. K80) model allows transitions (substitutions between two purines or between two pyrimidines) and transversions (substitutions between a purine and a pyrimidine) to occur at different rates, while assuming equal base frequencies. The distance is calculated as:
\[D = -\frac{1}{2} \ln\left((1 - 2P - Q) \sqrt{1 - 2Q}\right)\]
where \(P\) and \(Q\) are the proportions of sites with transitions and transversions, respectively.
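The formula above can be illustrated with a minimal pure-Python sketch. It is not the scikit-bio implementation: the helper names `transition_transversion_proportions` and `k2p_sketch` are hypothetical, and the handling of gaps, ambiguous characters, and U versus T is an assumption made for this example only.

```python
import math

PURINES = frozenset("AG")
BASES = frozenset("ACGT")


def transition_transversion_proportions(seq1, seq2):
    """Return (P, Q) for two aligned nucleotide strings of equal length."""
    if len(seq1) != len(seq2):
        raise ValueError("Aligned sequences must have the same length.")
    # Assumption: treat RNA's U as T so DNA and RNA strings are handled alike.
    s1 = seq1.upper().replace("U", "T")
    s2 = seq2.upper().replace("U", "T")
    sites = transitions = transversions = 0
    for a, b in zip(s1, s2):
        # Assumption: ignore sites with gaps or ambiguous characters.
        if a not in BASES or b not in BASES:
            continue
        sites += 1
        if a == b:
            continue
        if (a in PURINES) == (b in PURINES):
            transitions += 1    # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1  # purine<->pyrimidine
    if sites == 0:
        raise ValueError("No comparable sites.")
    return transitions / sites, transversions / sites


def k2p_sketch(seq1, seq2):
    """Uncorrected K2P distance, D = -1/2 ln((1 - 2P - Q) sqrt(1 - 2Q))."""
    p, q = transition_transversion_proportions(seq1, seq2)
    a = 1 - 2 * p - q
    b = 1 - 2 * q
    # A non-positive term means the logarithm is undefined; return NaN.
    if a <= 0 or b <= 0:
        return math.nan
    return -0.5 * math.log(a * math.sqrt(b))


# One transition (A/G) and one transversion (C/A) over 10 sites: P = Q = 0.1.
d = k2p_sketch("ACGTACGTAC", "GCGTACGTAA")
```

For this pair, \(P = Q = 0.1\), so the distance is \(-\frac{1}{2}\ln(0.7\sqrt{0.8})\), approximately 0.234, somewhat larger than the raw proportion of differing sites (0.2).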
The K2P model can be corrected for site-rate heterogeneity by assuming that evolutionary rates follow a gamma distribution:
\[D = \frac{\alpha}{2} \left[\left(1 - 2P - Q\right)^{-\frac{1}{\alpha}} + \frac{1}{2}\left(1 - 2Q\right)^{-\frac{1}{\alpha}} - \frac{3}{2}\right]\]
where \(\alpha > 0\) is the shape parameter of the gamma distribution.
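As a rough illustration of the gamma-corrected formula, the sketch below evaluates it directly from given proportions \(P\) and \(Q\). The function name `k2p_gamma_sketch` is hypothetical and this is not the library's code; it only shows how the terms combine and that a very large \(\alpha\) approaches the uncorrected distance.

```python
import math


def k2p_gamma_sketch(p, q, alpha):
    """Gamma-corrected K2P distance from transition (p) and transversion (q) proportions."""
    if alpha <= 0:
        raise ValueError("alpha must be a positive number.")
    a = 1 - 2 * p - q
    b = 1 - 2 * q
    # Non-positive terms cannot be raised to a negative fractional power meaningfully.
    if a <= 0 or b <= 0:
        return math.nan
    return 0.5 * alpha * (a ** (-1 / alpha) + 0.5 * b ** (-1 / alpha) - 1.5)


# Smaller alpha means stronger rate heterogeneity and a larger corrected distance.
strong = k2p_gamma_sketch(0.1, 0.1, 0.5)
# A very large alpha approximates equal rates across sites.
weak = k2p_gamma_sketch(0.1, 0.1, 1e6)
uncorrected = -0.5 * math.log(0.7 * math.sqrt(0.8))
```

Here `weak` agrees with `uncorrected` to within a small tolerance, while `strong` is noticeably larger, reflecting the extra hidden substitutions implied by rate variation among sites.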
- Parameters:
- seq1, seq2 : {DNA, RNA}
Sequences to compute the K2P distance between.
- gamma : float, optional
Shape parameter (\(\alpha\)) of the gamma distribution for among-site rate heterogeneity. Must be a positive number. If not provided, no gamma correction will be applied.
Added in version 0.7.3.
- Returns:
- float
K2P distance between the two sequences.
Notes
The Kimura 2-parameter model (K2P or K80) was originally described in [1] and its gamma correction in [2].
K2P extends the JC69 model by allowing transitions and transversions to occur at different rates. Conversely, K2P is the special case of the F84 model in which base frequencies are equal.
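As a worked check of the relationship to JC69: when all substitution types are equally likely, each base has one possible transition and two possible transversions, so with \(p\) denoting the total proportion of differing sites, the expected values are \(P = p/3\) and \(Q = 2p/3\). Substituting these into the uncorrected K2P formula gives:

\[1 - 2P - Q = 1 - 2Q = 1 - \frac{4}{3}p\]

\[D = -\frac{1}{2} \ln\left(\left(1 - \frac{4}{3}p\right)^{\frac{3}{2}}\right) = -\frac{3}{4} \ln\left(1 - \frac{4}{3}p\right)\]

which is the JC69 distance.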
This function returns NaN if either \(1 - 2P - Q\) or \(1 - 2Q\) is zero or negative, which indicates that substitutions are saturated and the distance cannot be estimated.
References
[1] Kimura, M. (1980). A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences. Journal of Molecular Evolution, 16(2), 111-120.
[2] Jin, L., & Nei, M. (1990). Limitations of the evolutionary parsimony method of phylogenetic analysis. Molecular Biology and Evolution, 7(1), 82-102.