skbio.sequence.distance.jc69
- skbio.sequence.distance.jc69(seq1, seq2)
Calculate the JC69 distance between two aligned nucleotide sequences.
Added in version 0.7.2.
The Jukes-Cantor 1969 (JC69) model estimates the evolutionary distance (number of substitutions per site) between two nucleotide sequences by correcting the observed proportion of differing sites (i.e., p-distance) to account for multiple putative substitutions at the same site (i.e., saturation). It is calculated as:
\[D = -\frac{3}{4} \ln\left(1 - \frac{4}{3} p\right)\]
- Parameters:
- seq1, seq2 : {DNA, RNA}
Sequences to compute the JC69 distance between.
- Returns:
- float
JC69 distance between the two sequences.
Notes
The Jukes-Cantor 1969 (JC69) model was originally described in [1].
JC69 is a basic evolutionary model for nucleotide sequences. It assumes equal base frequencies and equal substitution rates between bases. It models sequence evolution as a continuous-time Markov chain, and corrects the observed distance (p-distance) for repeated substitutions to estimate the true distance.
This function returns NaN if \(p \geq 0.75\). This happens when the two sequences are too divergent and substitutions are over-saturated for reliable estimation of the evolutionary distance.
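The correction and the NaN rule described above can be illustrated with a minimal pure-Python sketch. This is not scikit-bio's implementation: the helper names `p_distance` and `jc69_from_p` are hypothetical, the inputs are plain strings rather than DNA or RNA objects, and every mismatched position is counted as a difference (gaps and degenerate characters get no special treatment here, which may differ from what the library does).

```python
import math


def p_distance(seq1: str, seq2: str) -> float:
    """Proportion of aligned sites at which the two sequences differ.

    Hypothetical helper for illustration; assumes plain, equal-length strings.
    """
    if len(seq1) != len(seq2):
        raise ValueError("Sequences must be aligned to the same length.")
    if not seq1:
        raise ValueError("Sequences must not be empty.")
    diffs = sum(a != b for a, b in zip(seq1, seq2))
    return diffs / len(seq1)


def jc69_from_p(p: float) -> float:
    """Apply the JC69 correction D = -3/4 * ln(1 - 4/3 * p) to a p-distance."""
    if p >= 0.75:
        # The logarithm's argument is zero or negative here, so no finite
        # distance can be estimated; mirror the documented NaN behaviour.
        return math.nan
    if p == 0.0:
        # Identical sequences: return a clean 0.0 rather than -0.0.
        return 0.0
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# One mismatch in ten aligned sites gives p = 0.1.
p = p_distance("ACGTACGTAC", "ACGTACGTAA")
d = jc69_from_p(p)
print(f"p-distance = {p:.3f}, JC69 distance = {d:.4f}")
```

For small p the corrected distance stays close to p itself (here roughly 0.107 against 0.1); the gap widens as p approaches 0.75, where the correction diverges.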
References
[1] Jukes, T. H., & Cantor, C. R. (1969). Evolution of protein molecules. Mammalian Protein Metabolism, 3(21), 132.