skbio.stats.evolve.hommola_cospeciation
- skbio.stats.evolve.hommola_cospeciation(host_dist, par_dist, interaction, permutations=999, seed=None)
Perform Hommola et al (2009) host/parasite cospeciation test.
This test for host/parasite cospeciation is described in [1]. It is a modification of the Mantel test, extended to handle the case where multiple hosts map to a single parasite (and vice versa).
For a basic Mantel test, the distance matrices being compared must have the same number of values. To determine the significance of the correlations between distances in the two matrices, the correlation coefficient of those distances is calculated and compared to the correlation coefficients calculated from a set of matrices in which rows and columns have been permuted.
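The basic Mantel procedure described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not scikit-bio's implementation: `pearson`, `upper_triangle`, and `simple_mantel` are hypothetical helper names, the alternative hypothesis is one-sided (greater), and the observed statistic is counted as one of the permutations.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def upper_triangle(matrix, order):
    """Distances above the diagonal, with rows and columns reordered by `order`."""
    k = len(order)
    return [matrix[order[i]][order[j]] for i in range(k) for j in range(i + 1, k)]


def simple_mantel(dist_a, dist_b, permutations=999, seed=None):
    """Hypothetical helper: one-sided Mantel test on two equally sized matrices."""
    rng = random.Random(seed)
    identity = list(range(len(dist_a)))
    flat_b = upper_triangle(dist_b, identity)
    observed = pearson(upper_triangle(dist_a, identity), flat_b)
    as_extreme = 0
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)  # permute rows and columns of dist_a together
        if pearson(upper_triangle(dist_a, order), flat_b) >= observed:
            as_extreme += 1
    # Assumption: the observed statistic counts as one of the permutations.
    p_value = (as_extreme + 1) / (permutations + 1)
    return observed, p_value
```

Rows and columns are permuted together so that each permuted matrix is still a valid distance matrix over relabelled objects.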
In this test, rather than comparing host-host to parasite-parasite distances directly (which would require one host per parasite), the distances are compared for each interaction edge between host and parasite. Thus, a host interacting with two different parasites will be represented in two different edges, with the host-host distance for the comparison between those edges equal to zero, and the parasite-parasite distance equal to the distance between those two parasites. As in the Mantel test, significance of the interaction is assessed by permutation, in this case permutation of the host-symbiont interaction links.
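The edge-based comparison can be illustrated with a short pure-Python sketch that uses the example data from the Examples section of this page. The helper names `edge_distances` and `pearson` are hypothetical and the code only reproduces the observed statistic; it does not perform the permutation step and is not scikit-bio's implementation.

```python
import math

# Example data from the Examples section of this page.
hdist = [[0, 3, 8, 8, 9], [3, 0, 7, 7, 8], [8, 7, 0, 6, 7],
         [8, 7, 6, 0, 3], [9, 8, 7, 3, 0]]
pdist = [[0, 5, 8, 8, 8], [5, 0, 7, 7, 7], [8, 7, 0, 4, 4],
         [8, 7, 4, 0, 2], [8, 7, 4, 2, 0]]
interaction = [[1, 0, 0, 0, 0], [0, 1, 0, 0, 0], [0, 0, 1, 0, 0],
               [0, 0, 0, 1, 0], [0, 0, 0, 1, 1]]


def edge_distances(host_dist, par_dist, interaction):
    """Host-host and parasite-parasite distances for every pair of edges."""
    # Each interaction edge is a (parasite, host) index pair:
    # rows of `interaction` are parasites, columns are hosts.
    edges = [(p, h) for p, row in enumerate(interaction)
             for h, linked in enumerate(row) if linked]
    host_d, par_d = [], []
    for i in range(len(edges)):
        for j in range(i + 1, len(edges)):
            (p1, h1), (p2, h2) = edges[i], edges[j]
            host_d.append(host_dist[h1][h2])
            par_d.append(par_dist[p1][p2])
    return host_d, par_d


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


host_d, par_d = edge_distances(hdist, pdist, interaction)
print(len(host_d))                      # 6 edges give 15 edge pairs
print("%.3f" % pearson(host_d, par_d))  # 0.832
```

Here the fourth host is linked to two parasites, so one edge pair has a host-host distance of 0 and a parasite-parasite distance of 2, as described above.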
Note that the null hypothesis being tested here is that the hosts and parasites have evolved independently of one another. The alternative to this is a somewhat weaker case than what is often implied with the term ‘cospeciation,’ which is that each incidence of host speciation is recapitulated in an incidence of symbiont speciation (strict co-cladogenesis). Although there may be many factors that could contribute to non-independence of host and symbiont phylogenies, this loss of explanatory specificity comes with increased robustness to phylogenetic uncertainty. Thus, this test may be especially useful for cases where host and/or symbiont phylogenies are poorly resolved, or when simple correlation between host and symbiont evolution is of more interest than strict co-cladogenesis.
This test requires pairwise distance matrices for hosts and symbionts, as well as an interaction matrix specifying links between hosts (in columns) and symbionts (in rows). This interaction matrix should have the same number of columns as the host distance matrix, and the same number of rows as the symbiont distance matrix. Interactions between hosts and symbionts should be indicated by values of 1 or True, with non-interactions indicated by values of 0 or False.
- Parameters:
- host_dist : 2-D array_like or DistanceMatrix
Symmetric matrix of m x m pairwise distances between hosts.
- par_dist : 2-D array_like or DistanceMatrix
Symmetric matrix of n x n pairwise distances between parasites.
- interaction : 2-D array_like, bool
n x m binary matrix of parasite x host interactions. The order of hosts (columns) should match the order of hosts in host_dist, and the order of parasites (rows) should match the order of parasites in par_dist.
- permutations : int, optional
Number of permutations used to compute p-value. Must be greater than or equal to zero. If zero, statistical significance calculations will be skipped and the p-value will be np.nan.
- seed : int or np.random.Generator, optional
A user-provided random seed or random generator instance. See details.
Added in version 0.6.3.
- Returns:
- corr_coeff : float
Pearson correlation coefficient of host : parasite association.
- p_value : float
Significance of host : parasite association computed using permutations and a one-sided (greater) alternative hypothesis.
- perm_stats : 1-D numpy.ndarray, float
Correlation coefficients observed using permuted host : parasite interactions. Length will be equal to the number of permutations used to compute p-value (see permutations parameter above).
Notes
It is assumed that the ordering of parasites in par_dist and hosts in host_dist are identical to their ordering in the rows and columns, respectively, of the interaction matrix.
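Because the function cannot detect a mismatch in ordering, it can help to at least confirm that the three inputs have compatible shapes before running the test. The following is a minimal sketch of such a check for list-of-lists inputs; `check_interaction_shape` is a hypothetical helper, not part of scikit-bio, and it cannot verify that the orderings themselves agree.

```python
def check_interaction_shape(host_dist, par_dist, interaction):
    """Hypothetical pre-flight check; returns (n_parasites, m_hosts)."""
    m = len(host_dist)  # number of hosts
    n = len(par_dist)   # number of parasites
    if any(len(row) != m for row in host_dist):
        raise ValueError("host_dist must be an m x m matrix")
    if any(len(row) != n for row in par_dist):
        raise ValueError("par_dist must be an n x n matrix")
    if len(interaction) != n:
        raise ValueError("interaction must have one row per parasite")
    if any(len(row) != m for row in interaction):
        raise ValueError("interaction must have one column per host")
    # True == 1 and False == 0 in Python, so booleans pass this check too.
    if any(value not in (0, 1) for row in interaction for value in row):
        raise ValueError("interaction must contain only 0/1 or False/True")
    return n, m


# Three hosts, two parasites: interaction is 2 rows x 3 columns.
hosts = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]
parasites = [[0, 4], [4, 0]]
links = [[True, False, False], [False, True, True]]
print(check_interaction_shape(hosts, parasites, links))  # (2, 3)
```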
This code is loosely based on the original R code from [1].
References
Examples
>>> from skbio.stats.evolve import hommola_cospeciation
Create arrays for host distances, parasite distances, and their interactions (data taken from example in [1]):
>>> hdist = [[0,3,8,8,9], [3,0,7,7,8], [8,7,0,6,7], [8,7,6,0,3],
...          [9,8,7,3,0]]
>>> pdist = [[0,5,8,8,8], [5,0,7,7,7], [8,7,0,4,4], [8,7,4,0,2],
...          [8,7,4,2,0]]
>>> interaction = [[1,0,0,0,0], [0,1,0,0,0], [0,0,1,0,0], [0,0,0,1,0],
...                [0,0,0,1,1]]
Run the cospeciation test with 99 permutations. Note that the correlation coefficient for the observed values counts against the final reported p-value:
>>> corr_coeff, p_value, perm_stats = hommola_cospeciation(
...     hdist, pdist, interaction, permutations=99, seed=42)
>>> print("%.3f" % corr_coeff)
0.832
In this case, the host distances have a fairly strong positive correlation with the symbiont distances. However, this may also reflect structure inherent in the phylogeny, and is not itself indicative of significance.
>>> p_value <= 0.05
True
After permuting host : parasite interactions, we find that the observed correlation is indeed greater than we would expect by chance.
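The relationship between the permuted statistics and a one-sided p-value can be sketched as follows. This assumes the common convention, consistent with the note above that the observed coefficient counts against the reported p-value, of adding one to both the numerator and the denominator; `one_sided_p_value` is a hypothetical helper and the permuted values are made up for illustration.

```python
def one_sided_p_value(observed, perm_stats):
    """Hypothetical helper: fraction of statistics at least as large as observed."""
    if len(perm_stats) == 0:
        return float("nan")  # no permutations, so no significance estimate
    as_extreme = sum(1 for stat in perm_stats if stat >= observed)
    # Assumption: the observed statistic is counted as one extra "permutation".
    return (as_extreme + 1) / (len(perm_stats) + 1)


# Made-up permuted coefficients: 4 of 99 reach the observed value.
fake_perm_stats = [0.9] * 4 + [0.1] * 95
print(one_sided_p_value(0.832, fake_perm_stats))  # 0.05
```

Under this convention the smallest p-value obtainable with 99 permutations is 0.01.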