skbio.sequence.SubstitutionMatrix.by_name

classmethod SubstitutionMatrix.by_name(name)

Load a pre-defined substitution matrix by its name.

Parameters:
name : str

Name of the substitution matrix.

Returns:
SubstitutionMatrix

Named substitution matrix.

Raises:
ValueError

If no substitution matrix with the given name exists.

See also

get_names

Notes

Names are case-insensitive. For instance, BLOSUM62 and blosum62 point to the same substitution matrix.

Available substitution matrix names can be listed with get_names. The following names are currently supported:

  • NUC.4.4 (a.k.a. DNAfull): A nucleotide substitution matrix covering all definite and degenerate nucleotides.

  • Point Accepted Mutation (PAM) [1]: A set of amino acid substitution matrices, including PAM30, PAM70 and PAM250.

  • BLOcks SUbstitution Matrix (BLOSUM) [2]: A set of amino acid substitution matrices, including BLOSUM45, BLOSUM50, BLOSUM62, BLOSUM80 and BLOSUM90.

References

[1]

Dayhoff, M., Schwartz, R., & Orcutt, B. (1978). A model of evolutionary change in proteins. Atlas of protein sequence and structure, 5, 345-352.

[2]

Henikoff, S., & Henikoff, J. G. (1992). Amino acid substitution matrices from protein blocks. Proceedings of the National Academy of Sciences, 89(22), 10915-10919.

Examples

>>> from skbio import SubstitutionMatrix
>>> mat = SubstitutionMatrix.by_name('BLOSUM62')
>>> len(mat.alphabet)
24
>>> mat['M', 'K']
-1.0