skbio.stats.distance.pwmantel

skbio.stats.distance.pwmantel(dms, labels=None, method='pearson', permutations=999, alternative='two-sided', strict=True, lookup=None, seed=None)

Run Mantel tests for every pair of given distance matrices.

Runs a Mantel test for each pair of distance matrices and collates the results in a DataFrame. DistanceMatrix instances do not need to share the same ID order: each pair is re-ordered before its test is run. If strict=False, IDs that do not match between a pair of distance matrices are dropped before the test; otherwise, a ValueError is raised if any pair has nonmatching IDs.
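The ID handling described above can be sketched in plain Python. This is only an illustration of the matching rule, not scikit-bio's implementation; the helper names shared_ids and reorder are hypothetical, and the matrices are plain nested lists rather than DistanceMatrix objects.

```python
def shared_ids(ids_x, ids_y, strict=True):
    """Return the IDs to compare, in the order they appear in ids_x.

    Hypothetical helper: mimics the documented behaviour where
    nonmatching IDs raise ValueError when strict=True and are
    dropped when strict=False.
    """
    set_x, set_y = set(ids_x), set(ids_y)
    if strict and set_x != set_y:
        raise ValueError("IDs do not match between the two distance matrices.")
    return [id_ for id_ in ids_x if id_ in set_y]


def reorder(matrix, ids, order):
    """Reorder (and filter) a square matrix so rows/columns follow `order`."""
    position = {id_: k for k, id_ in enumerate(ids)}
    idx = [position[id_] for id_ in order]
    return [[matrix[r][c] for c in idx] for r in idx]


ids_x = ["a", "b", "c"]
x = [[0, 1, 2],
     [1, 0, 3],
     [2, 3, 0]]

# Same objects in a different order, plus an extra ID "d".
ids_y = ["c", "a", "b", "d"]
y = [[0, 7, 6, 4],
     [7, 0, 2, 5],
     [6, 2, 0, 9],
     [4, 5, 9, 0]]

order = shared_ids(ids_x, ids_y, strict=False)  # "d" is dropped
y_aligned = reorder(y, ids_y, order)
print(order)
print(y_aligned)
```

With strict=True the same call to shared_ids would raise a ValueError, because "d" appears only in the second matrix.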

Parameters:

dms (iterable of DistanceMatrix objects, array_like objects, or filepaths to distance matrices): If they are array_like, no reordering or matching of IDs will be performed.
labels (iterable of str or int, optional): Labels for each distance matrix in dms. These are used in the results DataFrame to identify the pair of distance matrices used in a pairwise Mantel test. If None, defaults to monotonically-increasing integers starting at zero.
method ({'pearson', 'spearman'}): Correlation method. See mantel function for more details.
permutations (int, optional): Number of permutations. See mantel function for more details.
alternative ({'two-sided', 'greater', 'less'}): Alternative hypothesis. See mantel function for more details.
strict (bool, optional): Handling of nonmatching IDs. See mantel function for more details.
lookup (dict, optional): Map existing IDs to new IDs. See mantel function for more details.
seed (int, Generator or RandomState, optional): A user-provided random seed or random generator instance. See details.

Added in version 0.6.3.

Returns:

pandas.DataFrame: DataFrame containing the results of each pairwise test (one per row). Includes the number of objects considered in each test as column n (after applying lookup and filtering nonmatching IDs if strict=False). Column p-value will display p-values as NaN if p-values could not be computed (they are stored as np.nan within the DataFrame; see mantel for more details).

See also

mantel
DistanceMatrix.read

Notes

Passing a list of filepaths reduces memory consumption, because only two matrices are loaded at a time rather than all distance matrices being held in memory at once.

Examples

Import the functionality we’ll use in the following examples:

>>> from skbio import DistanceMatrix
>>> from skbio.stats.distance import pwmantel

Define three 3x3 distance matrices:

>>> x = DistanceMatrix([[0, 1, 2],
...                     [1, 0, 3],
...                     [2, 3, 0]])
>>> y = DistanceMatrix([[0, 2, 7],
...                     [2, 0, 6],
...                     [7, 6, 0]])
>>> z = DistanceMatrix([[0, 5, 6],
...                     [5, 0, 1],
...                     [6, 1, 0]])

Run Mantel tests for each pair of distance matrices (there are 3 possible pairs):

>>> pwmantel((x, y, z), labels=('x', 'y', 'z'),
...          permutations=0)
             statistic p-value  n   method  permutations alternative
dm1 dm2
x   y     0.755929     NaN  3  pearson             0   two-sided
    z    -0.755929     NaN  3  pearson             0   two-sided
y   z    -0.142857     NaN  3  pearson             0   two-sided

Note that we passed permutations=0 to suppress significance tests, so the p-values in the output are reported as NaN.
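As a cross-check on the statistic column, the following sketch recomputes the three values using only the standard library. It assumes that, for method='pearson', the Mantel statistic is the Pearson correlation between the off-diagonal (upper-triangle) entries of the two matrices; the helpers condensed and pearson are hypothetical names written for this illustration and are not part of scikit-bio.

```python
from itertools import combinations
from math import sqrt


def condensed(matrix):
    """Return the upper-triangle (off-diagonal) entries of a square matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / sqrt(var_a * var_b)


# The same three matrices as above, as plain nested lists.
matrices = {
    "x": [[0, 1, 2], [1, 0, 3], [2, 3, 0]],
    "y": [[0, 2, 7], [2, 0, 6], [7, 6, 0]],
    "z": [[0, 5, 6], [5, 0, 1], [6, 1, 0]],
}

# One statistic per unordered pair, in the same order as the results table.
stats = {}
for (label1, dm1), (label2, dm2) in combinations(matrices.items(), 2):
    stats[(label1, label2)] = pearson(condensed(dm1), condensed(dm2))
    print(label1, label2, round(stats[(label1, label2)], 6))
```

The three printed values agree with the statistic column of the table above. No permutation step is included here, which mirrors the permutations=0 call.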