skbio.alignment.local_pairwise_align
- skbio.alignment.local_pairwise_align(seq1, seq2, gap_open_penalty, gap_extend_penalty, substitution_matrix)
Locally align exactly two sequences using the Smith-Waterman algorithm.
- Parameters:
- seq1: GrammaredSequence
The first unaligned sequence.
- seq2: GrammaredSequence
The second unaligned sequence.
- gap_open_penalty: int or float
Penalty for opening a gap (this is subtracted from the previous best alignment score, so it is typically positive).
- gap_extend_penalty: int or float
Penalty for extending a gap (this is subtracted from the previous best alignment score, so it is typically positive).
- substitution_matrix: 2D dict (or similar)
Lookup for substitution scores (these values are added to the previous best alignment score).
- Returns:
- tuple
A three-item tuple: a TabularMSA object containing the aligned sequences, the alignment score (float), and the start/end positions of each input sequence (iterable of two-item tuples). Note that the start/end positions are indices into the unaligned sequences.
See also
Notes
This algorithm was originally described in [1]. The scikit-bio implementation was validated against the EMBOSS water web server [2].
References
[1] Identification of common molecular subsequences. Smith TF, Waterman MS. J Mol Biol. 1981 Mar 25;147(1):195-7.
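Examples
The following is a minimal pure-Python sketch of how the three scoring parameters interact in a Smith-Waterman recurrence. It is not the scikit-bio implementation: the function name smith_waterman_score is hypothetical, it works on plain strings rather than GrammaredSequence objects, it returns only the best local score, and it assumes that a gap of length k costs gap_open_penalty + (k - 1) * gap_extend_penalty. The substitution matrix is assumed to be a nested dict indexed as matrix[a][b].

```python
def smith_waterman_score(seq1, seq2, gap_open_penalty, gap_extend_penalty,
                         substitution_matrix):
    """Return the best local alignment score (illustrative sketch only)."""
    n, m = len(seq1), len(seq2)
    neg_inf = float("-inf")
    # best[i][j]: best score of a local alignment ending at seq1[i-1], seq2[j-1]
    best = [[0.0] * (m + 1) for _ in range(n + 1)]
    # gap1[i][j]: best score ending with a gap in seq1 (seq2[j-1] is unpaired)
    gap1 = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    # gap2[i][j]: best score ending with a gap in seq2 (seq1[i-1] is unpaired)
    gap2 = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    top_score = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Penalties are subtracted from the previous best score.
            gap1[i][j] = max(best[i][j - 1] - gap_open_penalty,
                             gap1[i][j - 1] - gap_extend_penalty)
            gap2[i][j] = max(best[i - 1][j] - gap_open_penalty,
                             gap2[i - 1][j] - gap_extend_penalty)
            # Substitution scores are added to the previous best score.
            diagonal = (best[i - 1][j - 1]
                        + substitution_matrix[seq1[i - 1]][seq2[j - 1]])
            # Flooring at zero is what makes the alignment local.
            best[i][j] = max(0.0, diagonal, gap1[i][j], gap2[i][j])
            top_score = max(top_score, best[i][j])
    return top_score


# Hypothetical scoring scheme: +2 for a match, -1 for a mismatch.
nucleotides = "ACGT"
example_matrix = {a: {b: (2 if a == b else -1) for b in nucleotides}
                  for a in nucleotides}

print(smith_waterman_score("ACGT", "ACGT", 5, 2, example_matrix))  # 8.0
print(smith_waterman_score("AAAACCCC", "AAAAGGGCCCC", 2, 1, example_matrix))  # 12.0
```

In the second call, the eight matches contribute 16 and the three-position gap costs 2 + 1 + 1 = 4, which is why positive penalty values lower the score.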