skbio.diversity.alpha.shannon

skbio.diversity.alpha.shannon(counts, base=None, exp=False)

Calculate Shannon’s diversity index.

Shannon’s diversity index, \(H'\), also known as the Shannon index or Shannon-Wiener index, is equivalent to entropy in information theory. It is defined as:

\[H' = -\sum_{i=1}^S\left(p_i\log_b(p_i)\right)\]

where \(S\) is the number of taxa and \(p_i\) is the proportion of the sample represented by taxon \(i\).

The logarithm base \(b\) defaults to \(e\), but may be 2, 10, or another custom value.
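
The definition above can be sketched in plain Python. This is an illustrative reimplementation, not the library's code: the helper name `shannon_index` is hypothetical, and skipping zero counts (the convention \(0 \log 0 = 0\)) is an assumption made here for the sketch.

```python
import math

def shannon_index(counts, base=math.e):
    """Shannon index H' of a count vector (hypothetical helper, not skbio's code)."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive value")
    # Proportions p_i; zero counts are skipped (assumed convention: 0 * log(0) = 0).
    props = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p, base) for p in props)

even = shannon_index([5, 5, 5, 5])               # four equally abundant taxa, natural log
skewed = shannon_index([17, 1, 1, 1])            # one dominant taxon, natural log
even_bits = shannon_index([5, 5, 5, 5], base=2)  # same community, base 2
```

For four equally abundant taxa the index equals \(\ln 4\) in base \(e\) and 2 in base 2, while the skewed community scores lower because one taxon dominates.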

The exponential of the Shannon index, \(exp(H')\), measures the effective number of species (a.k.a. true diversity). It is equivalent to perplexity in information theory, or the Hill number of order 1 (\(^1D\)). Its value is independent of the logarithm base \(b\), provided \(b\) is also used as the base of the exponentiation:

\[exp(H') = b ^ {-\sum_{i=1}^S\left(p_i\log_b(p_i)\right)} = \prod_{i=1} ^{S}p_i^{-p_i}\]
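
A minimal sketch of the identity above, again in plain Python rather than through the library. The helper names `effective_number` and `effective_number_product` are hypothetical; the first raises the base to the power \(H'\), the second evaluates the product form directly.

```python
import math

def effective_number(counts, base=math.e):
    """b ** H', with H' computed using logarithm base b (hypothetical helper)."""
    total = sum(counts)
    props = [c / total for c in counts if c > 0]  # zero counts skipped (assumed convention)
    h = -sum(p * math.log(p, base) for p in props)
    return base ** h

def effective_number_product(counts):
    """Product form: prod(p_i ** -p_i); no logarithm base is involved."""
    total = sum(counts)
    props = [c / total for c in counts if c > 0]
    return math.prod(p ** -p for p in props)

counts = [17, 1, 1, 1]
d_e = effective_number(counts)             # base e
d_2 = effective_number(counts, base=2)     # base 2
d_10 = effective_number(counts, base=10)   # base 10
d_prod = effective_number_product(counts)  # product form
d_even = effective_number([5, 5, 5, 5])    # perfectly even community
```

All four values for `counts` agree up to floating-point rounding, and a perfectly even community of four taxa has an effective number of 4.
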
Parameters:
counts : 1-D array_like, int

Vector of counts.

base : int or float, optional

Logarithm base to use in the calculation. Default is e.

Changed in version 0.6.1: The default logarithm base was changed from 2 to \(e\) for consistency with the majority of literature.

exp : bool, optional

If True, return the exponential of Shannon index.

Returns:
float

Shannon’s diversity index, or its exponential if exp is True.
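
Because the default base changed from 2 to \(e\) in version 0.6.1, values computed with the old default differ from new ones by a constant factor. The sketch below rescales an index between bases using the change-of-base rule for logarithms; `convert_base` is a hypothetical helper, not part of the library.

```python
import math

def convert_base(h, old_base, new_base):
    """Rescale a Shannon index from one logarithm base to another (hypothetical helper)."""
    # log_new(x) = log_old(x) * ln(old_base) / ln(new_base)
    return h * math.log(old_base) / math.log(new_base)

h_bits = 2.0                                  # index of four equally abundant taxa in base 2
h_nats = convert_base(h_bits, 2, math.e)      # the same index expressed in base e
h_back = convert_base(h_nats, math.e, 2)      # round trip back to base 2
```

Here `h_nats` equals \(\ln 4\), the base-\(e\) index of the same community.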

Notes

Shannon index (i.e., entropy) was originally proposed in [1]. The exponential of Shannon index (i.e., perplexity) was discussed in [2] in the context of community diversity.

References

[1]

Shannon, C. E. (1948). A mathematical theory of communication. The Bell System Technical Journal, 27(3), 379-423.

[2]

Jost, L. (2006). Entropy and diversity. Oikos, 113(2), 363-375.