skbio.diversity.alpha.ace
- skbio.diversity.alpha.ace(counts, rare_threshold=10)
Calculate the ACE metric (Abundance-based Coverage Estimator).
The ACE metric is defined as:
\[S_{ace}=S_{abund}+\frac{S_{rare}}{C_{ace}}+\frac{F_1}{C_{ace}}\gamma^2_{ace}\]
where \(S_{abund}\) is the number of abundant taxa (with more than rare_threshold individuals) when all samples are pooled, \(S_{rare}\) is the number of rare taxa (with rare_threshold or fewer individuals) when all samples are pooled, \(C_{ace}\) is the sample abundance coverage estimator, \(F_1\) is the frequency of singletons, and \(\gamma^2_{ace}\) is the estimated coefficient of variation for rare taxa.
The estimated coefficient of variation is defined as (assuming rare_threshold is 10, the default):
\[\gamma^2_{ace}=\max\left[\frac{S_{rare}}{C_{ace}}\frac{\sum^{10}_{i=1}{i\left(i-1\right)F_i}}{N_{rare}\left(N_{rare}-1\right)}-1,0\right]\]
where \(F_i\) is the number of taxa with exactly \(i\) individuals and \(N_{rare}=\sum^{10}_{i=1}{iF_i}\) is the total number of individuals in rare taxa.
- Parameters:
- counts : 1-D array_like, int
Vector of counts.
- rare_threshold : int, optional
Threshold at which a taxon containing as many or fewer individuals will be considered rare.
- Returns:
- double
Computed ACE metric.
- Raises:
- ValueError
If every rare taxon is a singleton.
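As a worked illustration of the definition above, consider a hypothetical counts vector (1, 1, 2, 3, 4, 20) with the default rare_threshold of 10. The arithmetic assumes the usual definitions \(N_{rare}=\sum^{10}_{i=1}{iF_i}\) and \(C_{ace}=1-F_1/N_{rare}\), with \(F_i\) the number of taxa observed exactly \(i\) times; the coverage formula in particular is not spelled out on this page, so treat it as an assumption.
\[S_{abund}=1,\quad S_{rare}=5,\quad F_1=2,\quad F_2=F_3=F_4=1\]
\[N_{rare}=1\cdot 2+2+3+4=11,\qquad C_{ace}=1-\frac{2}{11}=\frac{9}{11}\]
\[\gamma^2_{ace}=\max\left[\frac{5}{9/11}\cdot\frac{0\cdot 2+2\cdot 1+6\cdot 1+12\cdot 1}{11\cdot 10}-1,0\right]=\max\left[\frac{55}{9}\cdot\frac{20}{110}-1,0\right]=\frac{1}{9}\]
\[S_{ace}=1+\frac{55}{9}+\frac{22}{9}\cdot\frac{1}{9}=\frac{598}{81}\approx 7.38\]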
Notes
ACE was first introduced in [1] and [2]. The implementation here is based on the description given in the EstimateS manual [3].
If no rare taxa exist, returns the number of abundant taxa. The default value of 10 for rare_threshold is based on [4].
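The behaviour described so far, including the ValueError case and the fallback when no rare taxa exist, can be sketched in plain Python. This is a minimal illustration and not the scikit-bio source: the name ace_sketch is hypothetical, and the sketch assumes the usual definitions N_rare = sum of i * F_i for i from 1 to rare_threshold and C_ace = 1 - F_1 / N_rare, which this page does not state explicitly.

```python
from collections import Counter


def ace_sketch(counts, rare_threshold=10):
    """Illustrative ACE estimate (hypothetical helper, not the library code)."""
    # Zero counts are dropped: unobserved taxa are neither rare nor abundant.
    observed = [int(c) for c in counts if c > 0]
    # freq[i] is F_i, the number of taxa observed exactly i times.
    freq = Counter(observed)

    s_abund = sum(1 for c in observed if c > rare_threshold)
    s_rare = sum(1 for c in observed if c <= rare_threshold)
    f1 = freq[1]

    # If every rare taxon is a singleton, C_ace would be 0 and ACE is undefined.
    if f1 > 0 and f1 == s_rare:
        raise ValueError("every rare taxon is a singleton; ACE is undefined")

    # No rare taxa: the estimate reduces to the number of abundant taxa.
    if s_rare == 0:
        return float(s_abund)

    rare_range = range(1, rare_threshold + 1)
    n_rare = sum(i * freq[i] for i in rare_range)
    c_ace = 1 - f1 / n_rare  # assumed definition of the coverage estimator

    # Squared coefficient of variation for rare taxa, floored at zero.
    numerator = sum(i * (i - 1) * freq[i] for i in rare_range)
    gamma2 = max((s_rare / c_ace) * numerator / (n_rare * (n_rare - 1)) - 1, 0.0)

    return s_abund + s_rare / c_ace + (f1 / c_ace) * gamma2
```

Under these assumptions, ace_sketch([1, 1, 2, 3, 4, 20]) evaluates to 598/81, about 7.38, and ace_sketch([12, 15, 20]) returns 3.0 because no taxon is rare. Padding the first vector with zeros leaves the result unchanged, and ace_sketch([1, 1, 1, 20]) raises ValueError.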
If counts contains zeros, indicating taxa which are known to exist in the environment but did not appear in the sample, they will be ignored for the purpose of calculating the number of rare taxa.
References
[1] Chao, A. & S.-M. Lee. 1992. Estimating the number of classes via sample coverage. Journal of the American Statistical Association 87, 210-217.
[2] Chao, A., M.-C. Ma, & M. C. K. Yang. 1993. Stopping rules and estimation for recapture debugging with unequal failure rates. Biometrika 80, 193-201.
[4] Chao, A., W.-H. Hwang, Y.-C. Chen, & C.-Y. Kuo. 2000. Estimating the number of shared species in two communities. Statistica Sinica 10, 227-246.