skbio.alignment.AlignPath.from_indices
- classmethod AlignPath.from_indices(indices, gap=-1)
Create an alignment path from character indices in the original sequences.
- Parameters:
- indices : array_like of int of shape (n_sequences, n_positions)
Each element in the array is the index of a character in the corresponding sequence, or the gap value where that sequence has a gap at that position.
- gap : int or “mask”, optional
The value that represents a gap in the alignment. Defaults to -1, but other values can be used. If “mask”, indices must be an np.ma.MaskedArray. Cannot use “del”.
- Returns:
- AlignPath
The alignment path created from the given indices.
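To make the expected layout of indices concrete, the following plain-Python sketch derives such a matrix from gapped, aligned strings. The helper name indices_from_aligned and the example strings are hypothetical and not part of scikit-bio; they only illustrate the convention described above, with -1 as the gap value.

```python
def indices_from_aligned(rows, gap_char="-", gap=-1):
    """Hypothetical helper: map aligned strings to per-sequence indices.

    Each non-gap character receives its 0-based index in the ungapped
    sequence; each gap character receives the ``gap`` value.
    """
    matrix = []
    for row in rows:
        next_index = 0  # index of the next character in the original sequence
        out = []
        for ch in row:
            if ch == gap_char:
                out.append(gap)
            else:
                out.append(next_index)
                next_index += 1
        matrix.append(out)
    return matrix


# Made-up aligned sequences, one string per sequence.
aligned = ["A--CGT", "ATG---", "A--CG-"]
idx = indices_from_aligned(aligned)
for row in idx:
    print(row)
```

The resulting nested list has shape (n_sequences, n_positions), which is the shape this method expects.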
See also
Notes
If a sequence in the alignment consists entirely of gaps, its start position will be equal to the gap value.
The input is equivalent to the transpose of the underlying data structure of Biotite’s Alignment class [1].
References
Examples
>>> import numpy as np
>>> from skbio.alignment import AlignPath
>>> idx = np.array([[0, -1, -1, 1, 2, 3],
...                 [0, 1, 2, -1, -1, -1],
...                 [0, -1, -1, 1, 2, -1]])
>>> path = AlignPath.from_indices(idx)
>>> path
<AlignPath, sequences: 3, positions: 6, segments: 4>
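The segment count reported above can be reproduced by hand. The sketch below assumes that a segment is a maximal run of consecutive positions sharing the same gap pattern across all sequences; this matches the count in the example but is not presented as scikit-bio's internal implementation. The function name gap_segments is hypothetical.

```python
from itertools import groupby


def gap_segments(indices, gap=-1):
    """Hypothetical helper: run-length encode per-position gap patterns.

    Returns a list of (pattern, length) pairs, where pattern is a tuple of
    booleans (True where a sequence has a gap at that position).
    """
    columns = zip(*indices)  # iterate over alignment positions
    patterns = [tuple(value == gap for value in col) for col in columns]
    return [(pattern, len(list(run))) for pattern, run in groupby(patterns)]


idx = [[0, -1, -1, 1, 2, 3],
       [0, 1, 2, -1, -1, -1],
       [0, -1, -1, 1, 2, -1]]

segments = gap_segments(idx)
lengths = [length for _, length in segments]
print(len(segments), lengths)
```

Under this assumption the six positions collapse into four segments, and the segment lengths sum to the number of positions.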
One can convert a Biotite Alignment object into a scikit-bio alignment path using this method. For example:
>>> from biotite.sequence import NucleotideSequence
>>> from biotite.sequence.align import SubstitutionMatrix
>>> from biotite.sequence.align import align_optimal
>>> submat = SubstitutionMatrix.std_nucleotide_matrix()
>>> seq1 = NucleotideSequence("GATCGTC")
>>> seq2 = NucleotideSequence("ATCGCTC")
>>> res = align_optimal(seq1, seq2, submat)
>>> print(res[0])
GATCG-TC
-ATCGCTC
>>> trace = res[0].trace
>>> trace
array([[ 0, -1],
       [ 1,  0],
       [ 2,  1],
       [ 3,  2],
       [ 4,  3],
       [-1,  4],
       [ 5,  5],
       [ 6,  6]])
>>> from skbio.alignment import PairAlignPath
>>> path = PairAlignPath.from_indices(trace.T)
>>> path
<PairAlignPath, positions: 8, segments: 4, CIGAR: '1D4M1I2M'>
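To see how the transposed trace relates to the CIGAR string shown above, here is a plain-Python sketch that needs neither Biotite nor scikit-bio. The mapping of a gap in the first sequence to I and a gap in the second sequence to D is an assumption inferred from the example output, and cigar_from_trace is a hypothetical name, not a library function.

```python
from itertools import groupby


def cigar_from_trace(trace, gap=-1):
    """Hypothetical helper: build a CIGAR string from a pairwise trace.

    ``trace`` has shape (n_positions, 2), as in the Biotite example; it is
    transposed first so that each row corresponds to one sequence.
    """
    first, second = zip(*trace)  # transpose to (2, n_positions)
    ops = []
    for a, b in zip(first, second):
        if a == gap:
            ops.append("I")  # assumed: gap in the first sequence
        elif b == gap:
            ops.append("D")  # assumed: gap in the second sequence
        else:
            ops.append("M")
    # Collapse consecutive identical operations into <length><op> pairs.
    return "".join(f"{len(list(run))}{op}" for op, run in groupby(ops))


trace = [[0, -1], [1, 0], [2, 1], [3, 2], [4, 3], [-1, 4], [5, 5], [6, 6]]
cigar = cigar_from_trace(trace)
print(cigar)
```

The sketch does not handle positions where both sequences are gapped, which do not occur in a pairwise trace like the one above.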