skbio.sequence.NucleotideMixin.gc_frequency
- NucleotideMixin.gc_frequency(relative=False)
Calculate frequency of G’s and C’s in the sequence.
This calculates the minimum GC frequency, which corresponds to IUPAC characters G, C, and S (which stands for G or C).
- Parameters:
- relative : bool, optional
If False, return the frequency of G, C, and S characters (i.e., the count). If True, return the relative frequency, i.e., the proportion of G, C, and S characters in the sequence. In this case the sequence is degapped before the calculation, so gap characters are not included in the sequence length.
- Returns:
- int or float
Either frequency (count) or relative frequency (proportion), depending on relative.
Examples
>>> from skbio import DNA
>>> DNA('ACGT').gc_frequency()
2
>>> DNA('ACGT').gc_frequency(relative=True)
0.5
>>> DNA('ACGT--..').gc_frequency(relative=True)
0.5
>>> DNA('--..').gc_frequency(relative=True)
0
S means G or C, so it counts:
>>> DNA('ASST').gc_frequency()
2
Other degenerates don’t count:
>>> DNA('RYKMBDHVN').gc_frequency()
0
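The counting rule described above can be illustrated on a plain string without scikit-bio. The following is a minimal sketch, not the library's implementation: the helper name `gc_frequency_sketch` and the constants `GC_CHARS` and `GAP_CHARS` are hypothetical, and the sketch assumes uppercase input with `-` and `.` as the only gap characters, as in the examples.

```python
# Hypothetical illustration of the documented behavior; not scikit-bio code.
GC_CHARS = frozenset("GCS")   # G, C, and the degenerate S (G or C)
GAP_CHARS = frozenset("-.")   # assumed gap characters, as in the examples


def gc_frequency_sketch(seq, relative=False):
    """Return the count of G/C/S in seq, or their proportion if relative."""
    gc_count = sum(1 for ch in seq if ch in GC_CHARS)
    if not relative:
        return gc_count

    # Relative mode: gaps are excluded from the sequence length.
    degapped_length = sum(1 for ch in seq if ch not in GAP_CHARS)
    if degapped_length == 0:
        # Nothing left after degapping; return 0 rather than divide by zero.
        return 0
    return gc_count / degapped_length


print(gc_frequency_sketch("ACGT"))                     # count
print(gc_frequency_sketch("ACGT--..", relative=True))  # proportion, gaps ignored
```

The explicit zero-length check mirrors the `DNA('--..')` example, where a sequence consisting only of gaps yields 0.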