skbio.util.get_rng

skbio.util.get_rng(seed=None)

Get a random generator.

Changed in version 0.6.3: Added legacy random generator support.

Parameters:
seed : int, Generator or RandomState, optional

A user-provided random seed or random generator instance.

Returns:
np.random.Generator

Random generator instance.

Notes

A seeded random generator makes stochastic outputs reproducible. scikit-bio uses NumPy’s new random generator (Generator) [1], which was introduced in NumPy 1.17. See NEP 19 [3] for an introduction to this change.

The following code demonstrates the recommended usage of the random generator with scikit-bio functions that are stochastic. Create a random generator once, then pass it to multiple function calls. The results of the entire script will then be reproducible.

(42 is an arbitrarily chosen random seed. It can be any non-negative integer.)

rng = np.random.default_rng(42)
skbio_func1(..., seed=rng)
skbio_func2(..., seed=rng)
...
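As a self-contained illustration of the shared-generator pattern above, the following sketch uses two hypothetical stand-in functions (pick_samples and shuffle_labels are not scikit-bio APIs) that accept a seed parameter. It assumes each function resolves seed with np.random.default_rng, which returns an existing Generator unchanged.

```python
import numpy as np


def pick_samples(n, k, seed=None):
    """Hypothetical stochastic function (not a scikit-bio API)."""
    # default_rng returns an existing Generator as-is, so state is shared.
    rng = np.random.default_rng(seed)
    return rng.choice(n, size=k, replace=False)


def shuffle_labels(labels, seed=None):
    """Hypothetical stochastic function (not a scikit-bio API)."""
    rng = np.random.default_rng(seed)
    return rng.permutation(labels)


def run_pipeline(seed):
    # One generator is created up front and passed to every call.
    rng = np.random.default_rng(seed)
    picked = pick_samples(100, 5, seed=rng)
    shuffled = shuffle_labels(["a", "b", "c", "d"], seed=rng)
    return picked.tolist(), shuffled.tolist()


first = run_pipeline(42)
second = run_pipeline(42)
print(first == second)  # the whole pipeline repeats exactly
```

Because both calls draw from the same generator, the second call continues the stream where the first left off, and rerunning the whole pipeline with the same seed repeats every result.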

Alternatively, you may specify an integer seed to make the result of each function call reproducible, such as:

skbio_func1(..., seed=42)
skbio_func2(..., seed=42)
...
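The difference between passing an integer and passing a shared generator can be seen in a small NumPy-only sketch. The draw function is a hypothetical stand-in, and the sketch assumes that an integer seed is turned into a fresh generator on every call, as np.random.default_rng does.

```python
import numpy as np


def draw(n, seed=None):
    """Hypothetical stochastic function (not a scikit-bio API)."""
    rng = np.random.default_rng(seed)
    return rng.random(n)


# Integer seed: every call starts from the same state.
a = draw(5, seed=42)
b = draw(5, seed=42)
print(np.array_equal(a, b))

# Shared generator: the second call continues the stream.
rng = np.random.default_rng(42)
c = draw(5, seed=rng)
d = draw(5, seed=rng)
print(np.array_equal(a, c), np.array_equal(c, d))
```

Under this assumption, an integer seed makes each call reproducible on its own, but two calls with the same integer also produce the same random numbers as each other. A shared generator avoids that repetition.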

Meanwhile, scikit-bio respects the legacy random generator (RandomState) [4]. If np.random.seed has been called, or a RandomState instance is provided, scikit-bio will create a new random generator from a seed selected by the legacy random generator. This ensures reproducibility of legacy code and compatibility with packages that use the legacy mechanism. For example:

np.random.seed(42)
skbio_func1(...)
skbio_func2(...)
...

Note that scikit-bio functions never use the legacy random generator directly for random number generation. Only the new random generator is exposed to them.
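The legacy behavior described above might be sketched as follows. This is an illustration only, not scikit-bio's actual implementation: sketch_get_rng is a hypothetical name, the upper bound used when drawing a seed is an arbitrary choice, and the sketch always consults NumPy's global legacy state when seed is None rather than detecting whether np.random.seed was called.

```python
import numpy as np

MAX_SEED = 2**31 - 1  # assumed upper bound for the derived seed


def sketch_get_rng(seed=None):
    """Illustrative sketch only; not scikit-bio's implementation."""
    if isinstance(seed, np.random.Generator):
        # A new-style generator is used as-is.
        return seed
    if isinstance(seed, np.random.RandomState):
        # Let the legacy generator pick a seed for a new Generator.
        return np.random.default_rng(seed.randint(MAX_SEED))
    if seed is None:
        # Fall back to NumPy's global legacy state (simplification).
        return np.random.default_rng(np.random.randint(MAX_SEED))
    # Otherwise treat the value as an integer seed.
    return np.random.default_rng(seed)


np.random.seed(42)
x = sketch_get_rng().random(3)

np.random.seed(42)
y = sketch_get_rng().random(3)

z = sketch_get_rng(np.random.RandomState(42)).random(3)
print(np.array_equal(x, y), np.array_equal(x, z))
```

In this sketch the legacy generator only supplies a seed. All random numbers come from the new Generator that is returned.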

References