skbio.alignment.AlignPath.to_indices
- AlignPath.to_indices(gap=-1)
Generate an array of indices of characters in the original sequences.
- Parameters:
- gap : int, np.nan, np.inf, “del”, or “mask”, optional
Method to encode gaps in the alignment. If numeric, replace gaps with this value. If “del”, delete columns that have any gap. If “mask”, return an np.ma.MaskedArray with gaps masked. Default is -1.
- Returns:
- ndarray of int of shape (n_sequences, n_positions)
Array of indices of characters in the original sequences.
See also
Notes
The transpose of the output matches the underlying data structure of Biotite’s Alignment class [1]. Therefore, one can convert scikit-bio alignments into Biotite alignments, and vice versa.
References
Examples
>>> from skbio.alignment import AlignPath
>>> path = AlignPath(lengths=[2, 1, 2, 1],
...                  states=[0, 6, 0, 1],
...                  starts=[0, 1, 2])
>>> idx = path.to_indices()
>>> idx
array([[ 0,  1,  2,  3,  4, -1],
       [ 1,  2, -1,  3,  4,  5],
       [ 2,  3, -1,  4,  5,  6]])
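To see where these numbers come from, the following is a minimal NumPy sketch of how a run-length encoded path could be expanded into an index array. It is not scikit-bio's implementation. The helper name `expand_path` is hypothetical, and the sketch assumes that each state is a small integer in which bit i is set when sequence i has a gap in that segment, which is consistent with the example above (state 6 gaps sequences 1 and 2, state 1 gaps sequence 0).

```python
import numpy as np


def expand_path(lengths, states, starts, gap=-1):
    """Hypothetical helper: expand a run-length encoded path into indices."""
    lengths = np.asarray(lengths)
    states = np.asarray(states)
    starts = np.asarray(starts)
    n_seqs = len(starts)
    # Assumption: bit i of a state is 1 when sequence i has a gap there.
    bits = (states[None, :] >> np.arange(n_seqs)[:, None]) & 1
    # Repeat each segment's gap flag once per alignment column it spans.
    is_gap = np.repeat(bits, lengths, axis=1).astype(bool)
    # Every non-gap column consumes one character of the original sequence,
    # so a running count of non-gap columns gives the character index.
    idx = np.cumsum(~is_gap, axis=1) - 1 + starts[:, None]
    # Overwrite gap positions with the chosen fill value.
    idx[is_gap] = gap
    return idx


idx = expand_path(lengths=[2, 1, 2, 1], states=[0, 6, 0, 1], starts=[0, 1, 2])
print(idx)
```

Under these assumptions the sketch reproduces the array shown above for the same `lengths`, `states`, and `starts`.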
One can create a Biotite Alignment object from the transposed indices and the original sequences.
>>> from biotite.sequence import NucleotideSequence
>>> from biotite.sequence.align import Alignment
>>> seqs = [NucleotideSequence("ACGTGA"),
...         NucleotideSequence("TACTCA"),
...         NucleotideSequence("GGACTGA")]
>>> aln = Alignment(seqs, idx.T)
>>> print(aln)
ACGTG-
AC-TCA
AC-TGA
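The “del” and “mask” options of the gap parameter can be illustrated in plain NumPy, starting from the default output of the first example. This is a sketch of the semantics as described under Parameters, not the library's own code. The variable names are illustrative, and details such as the dtype of what `to_indices(gap="del")` or `to_indices(gap="mask")` actually return may differ.

```python
import numpy as np

# Index array from the first example (default gap=-1).
idx = np.array([[0, 1, 2, 3, 4, -1],
                [1, 2, -1, 3, 4, 5],
                [2, 3, -1, 4, 5, 6]])
is_gap = idx == -1

# "del"-like result: keep only the columns in which no sequence has a gap.
no_gap_cols = idx[:, ~is_gap.any(axis=0)]

# "mask"-like result: same shape as idx, with the gap cells masked.
masked = np.ma.masked_array(idx, mask=is_gap)

print(no_gap_cols)
print(masked)
```

The masked form keeps the column structure of the alignment intact, whereas the deleted form leaves only fully aligned columns, so the one to use depends on whether downstream code needs alignment coordinates.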