skbio.alignment.AlignPath.to_aligned
- AlignPath.to_aligned(seqs, gap_char='-', flanking=None)
Extract aligned regions from original sequences.
Added in version 0.6.4.
- Parameters:
- seqs : iterable of Sequence or str
Original sequences.
- gap_char : str, optional
Character to be placed in each gap position. Default is "-". Set to "" to suppress gaps in the output.
- flanking : int or (int, int), optional
Length of flanking regions in the original sequences to be included in the output. Can be two numbers (leading and trailing, respectively) or one number (same for leading and trailing). If the specified flanking region extends beyond what a sequence actually has, the remaining positions are filled with spaces (" ").
- Returns:
- list of str
Aligned regions of the sequences.
- Raises:
- ValueError
If there are more sequences than in the path.
- ValueError
If any sequence is shorter than in the path.
Notes
This method provides a convenient way to process and display alignments without explicitly invoking the TabularMSA class. Both Sequence objects and plain strings are valid input sequences. However, it outputs only strings and does not retain the Sequence objects or their metadata. For the latter purpose, use TabularMSA's from_path_seqs method instead.
Examples
>>> from skbio.sequence import DNA
>>> from skbio.alignment import AlignPath
>>> path = AlignPath(
...     lengths=[2, 2, 2, 1, 1],
...     states=[0, 2, 0, 6, 0],
...     starts=[0, 3, 0],
... )
>>> seqs = [
...     DNA('CGTCGTGC'),
...     DNA('ATTCAGTCGG'),
...     DNA('CGTCGTTAA')
... ]
>>> path.to_aligned(seqs)
['CGTCGTGC', 'CA--GT-C', 'CGTCGT-T']
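To make the mapping from path to output easier to follow, here is a minimal pure-Python sketch of how a run-length encoded path like the one above could be decoded. It is not scikit-bio's implementation. The function name `decode_path` is hypothetical, and the sketch assumes that bit `i` of each state marks a gap in sequence `i`, which is consistent with the example output above. The `flanking` option is deliberately left out.

```python
def decode_path(seqs, lengths, states, starts, gap_char="-"):
    """Hypothetical decoder for a run-length encoded alignment path.

    Assumes bit i of each state is 1 when sequence i has a gap in
    that segment (an assumption, inferred from the example above).
    """
    if len(seqs) > len(starts):
        raise ValueError("There are more sequences than in the path.")
    aligned = []
    for i, seq in enumerate(seqs):
        seq = str(seq)       # accept plain strings or string-like objects
        pos = starts[i]      # where the aligned region begins in this sequence
        parts = []
        for length, state in zip(lengths, states):
            if (state >> i) & 1:
                # Gap segment: emit gap characters, consume no residues.
                parts.append(gap_char * length)
            else:
                if pos + length > len(seq):
                    raise ValueError(
                        f"Sequence {i} is shorter than in the path."
                    )
                # Residue segment: copy characters and advance.
                parts.append(seq[pos:pos + length])
                pos += length
        aligned.append("".join(parts))
    return aligned


seqs = ["CGTCGTGC", "ATTCAGTCGG", "CGTCGTTAA"]
lengths, states, starts = [2, 2, 2, 1, 1], [0, 2, 0, 6, 0], [0, 3, 0]

result = decode_path(seqs, lengths, states, starts)
print(result)  # ['CGTCGTGC', 'CA--GT-C', 'CGTCGT-T']

# An empty gap character drops the gaps from the output.
ungapped = decode_path(seqs, lengths, states, starts, gap_char="")
print(ungapped)  # ['CGTCGTGC', 'CAGTC', 'CGTCGTT']
```

In this sketch the two `ValueError` checks mirror the conditions listed under Raises. The real method may detect them differently.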