skbio.stats.distance.pwmantel

skbio.stats.distance.pwmantel(dms, labels=None, method='pearson', permutations=999, alternative='two-sided', strict=True, lookup=None, seed=None)

Run Mantel tests for every pair of given distance matrices.

Runs a Mantel test for each pair of distance matrices and collates the results in a DataFrame. DistanceMatrix instances do not need to share the same ID order: they are reordered before each pairwise test. If strict=False, IDs that don’t match between a pair of distance matrices are dropped before the test; otherwise, a ValueError is raised if any pair of distance matrices has nonmatching IDs.
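The ID handling described above can be pictured with a small pure-Python sketch. It is an illustration only, not scikit-bio's implementation: `align_pair` is a hypothetical helper, and each matrix is represented as a list of IDs plus nested lists of distances rather than a DistanceMatrix.

```python
def align_pair(ids_a, rows_a, ids_b, rows_b, strict=True):
    """Hypothetical helper: put two square matrices onto a shared ID order."""
    ids_b_set = set(ids_b)
    # Keep the order of the first matrix; retain only IDs present in both.
    shared = [id_ for id_ in ids_a if id_ in ids_b_set]
    if strict and (len(shared) != len(ids_a) or len(shared) != len(ids_b)):
        raise ValueError("IDs do not match between the two matrices")

    def subset(ids, rows):
        # Select and reorder rows/columns so they follow the shared ID order.
        position = {id_: k for k, id_ in enumerate(ids)}
        idx = [position[id_] for id_ in shared]
        return [[rows[r][c] for c in idx] for r in idx]

    return shared, subset(ids_a, rows_a), subset(ids_b, rows_b)


ids_a = ["s1", "s2", "s3"]
rows_a = [[0, 1, 2], [1, 0, 3], [2, 3, 0]]
# Same samples in a different order, plus an extra ID ("s4").
ids_b = ["s3", "s1", "s2", "s4"]
rows_b = [[0, 7, 6, 9], [7, 0, 2, 8], [6, 2, 0, 5], [9, 8, 5, 0]]

# Non-strict mode: "s4" is dropped and the second matrix is reordered.
shared, a, b = align_pair(ids_a, rows_a, ids_b, rows_b, strict=False)
```

With strict=True, the same call raises ValueError because "s4" has no counterpart in the first matrix.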

Parameters:
dms : iterable of DistanceMatrix objects, array_like objects, or filepaths

Distance matrices to compare, given as DistanceMatrix objects, array_like objects, or filepaths to distance matrices. If they are array_like, no reordering or matching of IDs will be performed.

labels : iterable of str or int, optional

Labels for each distance matrix in dms. These are used in the results DataFrame to identify the pair of distance matrices used in a pairwise Mantel test. If None, defaults to monotonically-increasing integers starting at zero.

method : {‘pearson’, ‘spearman’}

Correlation method. See mantel function for more details.

permutations : int, optional

Number of permutations. See mantel function for more details.

alternative : {‘two-sided’, ‘greater’, ‘less’}

Alternative hypothesis. See mantel function for more details.

strict : bool, optional

Handling of nonmatching IDs. See mantel function for more details.

lookup : dict, optional

Map existing IDs to new IDs. See mantel function for more details.

seed : int, Generator or RandomState, optional

A user-provided random seed or random generator instance. See skbio.util.get_rng for details.

Added in version 0.6.3.

Returns:
pandas.DataFrame

DataFrame containing the results of each pairwise test (one per row). Column n gives the number of objects considered in each test (after applying lookup and filtering nonmatching IDs if strict=False). Column p-value shows NaN where a p-value could not be computed (stored as np.nan within the DataFrame; see mantel for more details).

Notes

Passing a list of filepaths can reduce memory consumption, because only two matrices are loaded at a time rather than all distance matrices being held in memory at once.
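The pattern of holding only one pair in memory at a time can be sketched as follows. This is a generic illustration, not scikit-bio's code: `iter_loaded_pairs` and `fake_load` are hypothetical names, and `fake_load` stands in for reading a matrix from disk.

```python
from itertools import combinations


def iter_loaded_pairs(paths, load):
    """Hypothetical helper: yield each pair of paths with only that pair loaded."""
    for path_a, path_b in combinations(paths, 2):
        # Only the two matrices of the current pair exist at this point;
        # they can be garbage-collected before the next pair is loaded.
        yield path_a, path_b, load(path_a), load(path_b)


load_calls = []


def fake_load(path):
    # Stand-in for reading a distance matrix from a file; records each read.
    load_calls.append(path)
    return f"<matrix from {path}>"


paths = ["x.txt", "y.txt", "z.txt"]  # placeholder filenames
pairs = [(a, b) for a, b, _, _ in iter_loaded_pairs(paths, fake_load)]
```

In this sketch each file is read once per pair it takes part in, so the lower memory use comes at the cost of repeated reads. Whether scikit-bio re-reads files in exactly this way is not stated here.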

Examples

Import the functionality we’ll use in the following examples:

>>> from skbio import DistanceMatrix
>>> from skbio.stats.distance import pwmantel

Define three 3x3 distance matrices:

>>> x = DistanceMatrix([[0, 1, 2],
...                     [1, 0, 3],
...                     [2, 3, 0]])
>>> y = DistanceMatrix([[0, 2, 7],
...                     [2, 0, 6],
...                     [7, 6, 0]])
>>> z = DistanceMatrix([[0, 5, 6],
...                     [5, 0, 1],
...                     [6, 1, 0]])

Run Mantel tests for each pair of distance matrices (there are 3 possible pairs):

>>> pwmantel((x, y, z), labels=('x', 'y', 'z'),
...          permutations=0) 
             statistic p-value  n   method  permutations alternative
dm1 dm2
x   y     0.755929     NaN  3  pearson             0   two-sided
    z    -0.755929     NaN  3  pearson             0   two-sided
y   z    -0.142857     NaN  3  pearson             0   two-sided

Note that we passed permutations=0 to suppress significance tests; the p-values in the output are labelled NaN.
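As a cross-check on the statistics above, the Pearson Mantel statistic is the Pearson correlation between the corresponding off-diagonal distances of the two matrices. The following pure-Python sketch recomputes the three values without scikit-bio. The helpers `upper_triangle` and `pearson` are illustrative names and not part of the library.

```python
from itertools import combinations
from math import sqrt


def upper_triangle(rows):
    """Flatten the strictly upper triangle of a square matrix, row by row."""
    n = len(rows)
    return [rows[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(u, v):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_u, mean_v = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    var_u = sum((a - mean_u) ** 2 for a in u)
    var_v = sum((b - mean_v) ** 2 for b in v)
    return cov / sqrt(var_u * var_v)


matrices = {
    "x": [[0, 1, 2], [1, 0, 3], [2, 3, 0]],
    "y": [[0, 2, 7], [2, 0, 6], [7, 6, 0]],
    "z": [[0, 5, 6], [5, 0, 1], [6, 1, 0]],
}

# One statistic per unordered pair of matrices, as in the table above.
stats = {
    (a, b): pearson(upper_triangle(matrices[a]), upper_triangle(matrices[b]))
    for a, b in combinations(matrices, 2)
}

for pair, r in stats.items():
    print(pair, round(r, 6))
# ('x', 'y') 0.755929
# ('x', 'z') -0.755929
# ('y', 'z') -0.142857
```

The sketch covers only the test statistic. The permutation-based p-values that pwmantel reports when permutations is greater than zero are not reproduced here.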