skbio.alignment.TabularMSA.extend
- TabularMSA.extend(sequences, minter=None, index=None, reset_index=False)
Extend this MSA with sequences without recomputing alignment.
- Parameters:
- sequences : iterable of GrammaredSequence
Sequences to be appended. Must match the dtype of the MSA and the number of positions in the MSA.
- minter : callable or metadata key, optional
Used to create index labels for the sequences being appended. If callable, it is called on each sequence and its return value is used as the label. Otherwise it’s treated as a key into the sequence metadata. Note that minter cannot be combined with index or reset_index.
- index : pd.Index consumable, optional
Index labels to use for the appended sequences. Must be the same length as sequences. Must be able to be passed directly to the pd.Index constructor. Note that index cannot be combined with minter or reset_index.
- reset_index : bool, optional
If True, this MSA’s index is reset to the TabularMSA constructor’s default after extending. Note that reset_index cannot be combined with minter or index.
- Raises:
- ValueError
If none, or more than one, of minter, index, and reset_index is provided. Exactly one of them is required.
- ValueError
If index is not the same length as sequences.
- TypeError
If sequences contains an object that isn’t a GrammaredSequence.
- TypeError
If sequences contains a type that does not match the dtype of the MSA.
- ValueError
If the length of a sequence does not match the number of positions in the MSA.
Notes
The MSA is not automatically re-aligned when appending sequences. Therefore, this operation is not necessarily meaningful on its own.
Examples
Create an MSA with a single sequence labeled 'seq1':
>>> from skbio import DNA, TabularMSA
>>> msa = TabularMSA([DNA('ACGT')], index=['seq1'])
>>> msa
TabularMSA[DNA]
---------------------
Stats:
    sequence count: 1
    position count: 4
---------------------
ACGT
>>> msa.index
Index(['seq1'], dtype='object')
Extend the MSA with sequences, providing their index labels via index:
>>> msa.extend([DNA('AG-T'), DNA('-G-T')], index=['seq2', 'seq3'])
>>> msa
TabularMSA[DNA]
---------------------
Stats:
    sequence count: 3
    position count: 4
---------------------
ACGT
AG-T
-G-T
>>> msa.index
Index(['seq1', 'seq2', 'seq3'], dtype='object')
Extend with more sequences, this time resetting the MSA’s index labels to the default with reset_index. Note that since the MSA’s index is reset, we do not need to provide index labels for the new sequences via index or minter:
>>> msa.extend([DNA('ACGA'), DNA('AC-T'), DNA('----')],
...            reset_index=True)
>>> msa
TabularMSA[DNA]
---------------------
Stats:
    sequence count: 6
    position count: 4
---------------------
ACGT
AG-T
...
AC-T
----
>>> msa.index
RangeIndex(start=0, stop=6, step=1)