skbio.alignment.AlignPath.to_indices
- AlignPath.to_indices(gap=-1)
Generate an array of indices of characters in the original sequences.
- Parameters:
- gap : int, np.nan, np.inf, “del”, or “mask”, optional
Method to encode gaps in the alignment. If numeric, replace gaps with this value. If “del”, delete columns that have any gap. If “mask”, return an np.ma.MaskedArray with gaps masked. Default is -1.
- Returns:
- ndarray of int of shape (n_sequences, n_positions)
Array of indices of characters in the original sequences.
Notes
The transpose of the output matches the underlying data structure of Biotite’s Alignment class [1]. Therefore, one can convert scikit-bio alignments into Biotite alignments, and vice versa.
References
Examples
>>> from skbio.alignment import AlignPath
>>> path = AlignPath(lengths=[2, 1, 2, 1],
...                  states=[0, 6, 0, 1],
...                  starts=[0, 1, 2])
>>> idx = path.to_indices()
>>> idx
array([[ 0,  1,  2,  3,  4, -1],
       [ 1,  2, -1,  3,  4,  5],
       [ 2,  3, -1,  4,  5,  6]])
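To make the index generation and the gap options more concrete, the following is a minimal pure-Python sketch, not scikit-bio's implementation. The function name `sketch_to_indices` is hypothetical. It assumes that bit i of each state is set when sequence i has a gap in that segment, which is inferred from the example above (state 6 gaps sequences 1 and 2, state 1 gaps sequence 0). Only numeric gap values and "del" are covered; "mask" is omitted because it relies on NumPy masked arrays.

```python
def sketch_to_indices(lengths, states, starts, gap=-1):
    """Illustrative re-implementation of index generation (assumption-based).

    Assumes bit i of a state is 1 when sequence i has a gap in that segment.
    Supports a numeric gap value or "del"; "mask" is not handled here.
    """
    n_seqs = len(starts)
    rows = [[] for _ in range(n_seqs)]
    keep = []            # True for alignment columns that contain no gap
    pos = list(starts)   # next character index to emit for each sequence

    for length, state in zip(lengths, states):
        for _ in range(length):
            keep.append(state == 0)
            for i in range(n_seqs):
                if (state >> i) & 1:
                    rows[i].append(None)      # gap in sequence i
                else:
                    rows[i].append(pos[i])    # character index in sequence i
                    pos[i] += 1

    if gap == "del":
        # Drop every column in which at least one sequence has a gap.
        return [[v for v, k in zip(row, keep) if k] for row in rows]
    # Otherwise replace gaps with the given numeric value.
    return [[gap if v is None else v for v in row] for row in rows]


lengths, states, starts = [2, 1, 2, 1], [0, 6, 0, 1], [0, 1, 2]
default_idx = sketch_to_indices(lengths, states, starts)
deleted_idx = sketch_to_indices(lengths, states, starts, gap="del")
print(default_idx)
print(deleted_idx)
```

With the default gap value the sketch reproduces the array shown above; with "del" the third and sixth columns are dropped, leaving four gap-free columns per sequence.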
One can create a Biotite Alignment object from the transposed indices and the original sequences.
>>> from biotite.sequence import NucleotideSequence
>>> from biotite.sequence.align import Alignment
>>> seqs = [NucleotideSequence("ACGTGA"),
...         NucleotideSequence("TACTCA"),
...         NucleotideSequence("GGACTGA")]
>>> aln = Alignment(seqs, idx.T)
>>> print(aln)
ACGTG-
AC-TCA
AC-TGA
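The same indices can also be used without Biotite to recover the gapped rows. The sketch below is a hedged illustration in plain Python; the helper name `render_alignment` and its `gap_char` parameter are hypothetical and not part of scikit-bio or Biotite. It takes the index rows from the example above as nested lists and the original sequences as plain strings.

```python
def render_alignment(seqs, indices, gap=-1, gap_char="-"):
    """Build gapped strings by looking up each index in its own sequence.

    The gap value is checked before indexing, because -1 is itself a valid
    Python index and would otherwise silently return the last character.
    """
    return [
        "".join(gap_char if i == gap else seq[i] for i in row)
        for seq, row in zip(seqs, indices)
    ]


idx_rows = [[0, 1, 2, 3, 4, -1],
            [1, 2, -1, 3, 4, 5],
            [2, 3, -1, 4, 5, 6]]
raw_seqs = ["ACGTGA", "TACTCA", "GGACTGA"]

rendered = render_alignment(raw_seqs, idx_rows)
for line in rendered:
    print(line)
```

The printed rows should match the Biotite output above. The explicit gap check is the main design point: when the default gap value of -1 is used, indexing a sequence directly with the raw indices would wrap around to the final character instead of raising an error.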