pub fn entropy_rate_bytes(data: &[u8], max_order: i64) -> f64
Compute entropy rate Ĥ(X) in bits/symbol using ROSA LM.
This uses ROSA’s context-conditional Witten-Bell model to estimate the entropy rate, which accounts for sequential dependencies.
The estimator is prequential (predictive sequential): it sums the negative log-probability
of each symbol x_t given its past context x_{<t}, estimated from the model trained on x_{<t}.
Ĥ(X) = -(1/N) Σ_{t=1}^{N} log2 P(x_t | x_{t-k}^{t-1})
For i.i.d. data, sequential dependencies vanish, so this estimate should approximately equal the value returned by marginal_entropy_bytes.
max_order: Maximum context order for the suffix automaton LM. A value of -1 means unlimited context (bounded only by memory/sequence length).