§Datagen: Synthetic Data Generators for Validation
This module provides generators that produce synthetic data with known theoretical entropy values. They are useful for validating entropy estimators and for creating reproducible test cases.
§Example
use infotheory::datagen::{bernoulli, uniform_random};
// Bernoulli(0.5) has H(p) = 1.0 bit
let data = bernoulli(1000, 0.5, 42);
// Uniform random bytes have H = 8.0 bits/byte
let uniform = uniform_random(1000, 42);
§Functions
- bernoulli - Generate a Bernoulli bit sequence with probability p of each bit being 1.
- bernoulli_entropy - Theoretical entropy for a Bernoulli(p) source in bits.
- deterministic_func - Generate a functionally dependent pair Y = f(X).
- highly_compressible - Generate a highly compressible string (repeating pattern).
- identical_pair - Generate two identical sequences (for testing identity properties).
- independent_pair - Generate two independent random sequences (for testing independence properties).
- markov_1_binary - Generate a first-order binary Markov chain.
- markov_1_binary_entropy_rate - Theoretical entropy rate for a binary Markov chain with transition probs p00, p11.
- noisy_channel - Generate a Binary Symmetric Channel (BSC) output.
- periodic - Generate a periodic sequence by repeating a pattern.
- uniform_random - Generate uniform random bytes.
- xor_pair - Generate three sequences X, Y, Z where Z = X XOR Y.
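The theoretical values referenced by bernoulli_entropy and markov_1_binary_entropy_rate follow from standard closed-form formulas. The sketch below is a standalone illustration of those formulas plus a simple plug-in estimator that they could be compared against. It does not call this crate and is not its implementation: the function names binary_entropy_bits, markov_entropy_rate_bits, and empirical_entropy_bits are hypothetical, and it assumes p00 and p11 are the probabilities of staying in state 0 and state 1 respectively.

```rust
/// Binary entropy H(p) = -p*log2(p) - (1-p)*log2(1-p), in bits.
/// Hypothetical helper; uses the convention 0 * log2(0) = 0.
fn binary_entropy_bits(p: f64) -> f64 {
    if p <= 0.0 || p >= 1.0 {
        return 0.0;
    }
    -p * p.log2() - (1.0 - p) * (1.0 - p).log2()
}

/// Entropy rate (bits/symbol) of a stationary first-order binary Markov chain.
/// Assumption: p00 = P(next = 0 | current = 0), p11 = P(next = 1 | current = 1).
fn markov_entropy_rate_bits(p00: f64, p11: f64) -> f64 {
    let a = 1.0 - p00; // P(0 -> 1)
    let b = 1.0 - p11; // P(1 -> 0)
    if a + b == 0.0 {
        // Both states are absorbing, so the chain never changes: zero entropy rate.
        return 0.0;
    }
    // Stationary distribution of the two-state chain.
    let pi0 = b / (a + b);
    let pi1 = a / (a + b);
    // Weighted average of the per-state transition entropies.
    pi0 * binary_entropy_bits(p00) + pi1 * binary_entropy_bits(p11)
}

/// Plug-in (maximum-likelihood) estimate of entropy in bits per byte,
/// computed from observed byte frequencies.
fn empirical_entropy_bits(data: &[u8]) -> f64 {
    if data.is_empty() {
        return 0.0;
    }
    let mut counts = [0usize; 256];
    for &byte in data {
        counts[byte as usize] += 1;
    }
    let n = data.len() as f64;
    counts
        .iter()
        .filter(|&&c| c > 0)
        .map(|&c| {
            let q = c as f64 / n;
            -q * q.log2()
        })
        .sum()
}
```

With these definitions, binary_entropy_bits(0.5) evaluates to 1.0 bit, matching the Bernoulli(0.5) comment in the example, and a buffer containing every byte value equally often gives 8.0 bits/byte from empirical_entropy_bits. On finite random samples the plug-in estimate is only approximately equal to the theoretical value, so any comparison against generated data should allow a tolerance.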