skbio.alignment.AlignPath.to_aligned

AlignPath.to_aligned(seqs, gap_char='-', flanking=None)

Extract aligned regions from original sequences.

Added in version 0.6.4.

Parameters:
seqs : iterable of Sequence or str

Original sequences.

gap_char : str, optional

Character to be placed in each gap position. Default is “-”. Set to “” (an empty string) to omit gaps from the output.

flanking : int or (int, int), optional

Length of the flanking regions in the original sequences to be included in the output. Can be two numbers (leading and trailing, respectively) or one number (the same length for both leading and trailing). If the specified flanking region is longer than what a sequence actually has, the remaining space will be filled with spaces (“ ”).
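The padding rule for `flanking` can be illustrated with a small pure-Python sketch. This is not scikit-bio's implementation: `flank_strings` is a hypothetical helper, and the side on which the padding spaces are placed (left of the leading flank, right of the trailing flank) is an assumption made for illustration.

```python
def flank_strings(seq, start, stop, flanking):
    """Return (leading, trailing) flanks around seq[start:stop].

    Hypothetical helper for illustration only; not part of scikit-bio.
    """
    # Accept one number (same on both sides) or a (leading, trailing) pair.
    if isinstance(flanking, int):
        lead_len = trail_len = flanking
    else:
        lead_len, trail_len = flanking

    # Take as many flanking characters as the sequence actually has, then
    # pad with spaces up to the requested length (padding side is assumed).
    leading = seq[max(0, start - lead_len):start].rjust(lead_len)
    trailing = seq[stop:stop + trail_len].ljust(trail_len)
    return leading, trailing


# The aligned region of 'ATTCAGTCGG' in the example on this page spans
# positions 3 to 8; only 3 leading and 2 trailing characters are available.
lead, trail = flank_strings("ATTCAGTCGG", 3, 8, 4)
print(repr(lead), repr(trail))
```

Requesting four flanking characters on each side therefore yields one padding space before the three available leading characters and two padding spaces after the two available trailing characters.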

Returns:
list of str

Aligned regions of the sequences.

Raises:
ValueError

If there are more sequences than the path contains.

ValueError

If any sequence is shorter than the path requires.

Notes

This method provides a convenient way to process and display alignments without explicitly constructing a TabularMSA object. Both Sequence objects and plain strings are valid input sequences.

However, it outputs only strings and does not retain the Sequence objects or their metadata. For the latter purpose, please use TabularMSA’s from_path_seqs method instead.

Examples

>>> from skbio.sequence import DNA
>>> from skbio.alignment import AlignPath
>>> path = AlignPath(
...     lengths=[2, 2, 2, 1, 1],
...     states=[0, 2, 0, 6, 0],
...     starts=[0, 3, 0],
... )
>>> seqs = [
...    DNA('CGTCGTGC'),
...    DNA('ATTCAGTCGG'),
...    DNA('CGTCGTTAA')
... ]
>>> path.to_aligned(seqs)
['CGTCGTGC',
 'CA--GT-C',
 'CGTCGT-T']
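
To see how the output above follows from the path, the decoding can be sketched in plain Python. This is an illustrative reimplementation, not scikit-bio's code: `to_aligned_sketch` is a hypothetical function, and it assumes (consistent with the example above) that bit *i* of each state, counting from the least significant bit, marks a gap in sequence *i* for that segment.

```python
def to_aligned_sketch(lengths, states, starts, seqs, gap_char="-"):
    """Rebuild aligned strings from a run-length encoded path.

    Hypothetical helper for illustration only; not part of scikit-bio.
    """
    if len(seqs) > len(starts):
        raise ValueError("There are more sequences than in the path.")

    aligned = []
    for i, seq in enumerate(seqs):
        seq = str(seq)
        pos = starts[i]  # where this sequence's aligned region begins
        parts = []
        for length, state in zip(lengths, states):
            if (state >> i) & 1:
                # Bit i is set: sequence i has a gap across this segment.
                parts.append(gap_char * length)
            else:
                # Bit i is clear: consume `length` characters of sequence i.
                if pos + length > len(seq):
                    raise ValueError(f"Sequence {i} is shorter than in the path.")
                parts.append(seq[pos:pos + length])
                pos += length
        aligned.append("".join(parts))
    return aligned


result = to_aligned_sketch(
    lengths=[2, 2, 2, 1, 1],
    states=[0, 2, 0, 6, 0],
    starts=[0, 3, 0],
    seqs=["CGTCGTGC", "ATTCAGTCGG", "CGTCGTTAA"],
)
print(result)
```

State 2 (binary 010) places a two-position gap in the second sequence, and state 6 (binary 110) places a one-position gap in both the second and third sequences, which reproduces the three strings shown above. Passing `gap_char=""` to the sketch drops those gap positions entirely.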