Crate infotheory

§InfoTheory: Information-Theoretic Estimators & Metrics

This crate provides a comprehensive suite of information-theoretic primitives for quantifying complexity, dependence, and similarity between data sequences.

It implements two primary classes of estimators:

  1. Compression-based (Kolmogorov Complexity): Using the ZPAQ compression algorithm to estimate Normalized Compression Distance (NCD).
  2. Entropy-based (Shannon Information): Using both exact marginal histograms (for i.i.d. data) and the ROSA (Rapid Online Suffix Automaton) predictive language model (for sequential data) to estimate Entropy, Mutual Information, and related distances.

§Mathematical Primitives

The library implements the following core measures. For sequential data, “Rate” variants use the ROSA model to estimate Ĥ(X) (entropy rate), while “Marginal” variants treat data as a bag-of-bytes (i.i.d.) and compute H(X) from histograms.
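
For intuition, here is a minimal, self-contained sketch of the marginal estimator (illustrative only; the crate's marginal_entropy_bytes is the actual entry point):

fn marginal_entropy(data: &[u8]) -> f64 {
    // Histogram of byte frequencies; assumes non-empty input.
    let mut counts = [0usize; 256];
    for &b in data {
        counts[b as usize] += 1;
    }
    let n = data.len() as f64;
    // H(X) = −Σ p(x) log₂ p(x), in bits/symbol.
    counts
        .iter()
        .filter(|&&c| c > 0)
        .map(|&c| {
            let p = c as f64 / n;
            -p * p.log2()
        })
        .sum()
}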

§1. Normalized Compression Distance (NCD)

Approximates the Normalized Information Distance (NID) using a compressor C.

NCD(x,y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y))
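
The distance itself is simple arithmetic over three compressed sizes; a minimal sketch that works with any compressor (the crate uses ZPAQ, via get_compressed_size):

fn ncd(cx: u64, cy: u64, cxy: u64) -> f64 {
    let (min, max) = if cx < cy { (cx, cy) } else { (cy, cx) };
    // saturating_sub guards against a compressor returning C(xy) < min(C(x), C(y)).
    cxy.saturating_sub(min) as f64 / max as f64
}

Values near 0 indicate highly similar inputs; values near (or slightly above) 1 indicate unrelated ones.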

§2. Normalized Entropy Distance (NED)

An entropic analogue to NCD, defined using Shannon entropy H.

NED(X,Y) = (H(X,Y) - min(H(X), H(Y))) / max(H(X), H(Y))
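
Given entropy estimates H(X), H(Y), H(X,Y) from either backend, the distance itself is one line; a sketch:

fn ned(hx: f64, hy: f64, hxy: f64) -> f64 {
    (hxy - hx.min(hy)) / hx.max(hy)
}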

§3. Normalized Transform Effort (NTE)

Based on the Variation of Information (VI), normalized by the maximum entropy.

NTE(X,Y) = (H(X|Y) + H(Y|X)) / max(H(X), H(Y)) = (2H(X,Y) - H(X) - H(Y)) / max(H(X), H(Y))
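
The same three entropy estimates suffice; a sketch using the second form:

fn nte(hx: f64, hy: f64, hxy: f64) -> f64 {
    // VI(X,Y) = 2H(X,Y) − H(X) − H(Y), normalized by the maximum entropy.
    (2.0 * hxy - hx - hy) / hx.max(hy)
}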

§4. Mutual Information (MI)

Measures the amount of information obtained about one random variable by observing another.

I(X;Y) = H(X) + H(Y) - H(X,Y)
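
A self-contained sketch of the marginal variant, assuming (as the bits/symbol-pair unit of joint_entropy suggests) that the joint distribution is built from aligned byte pairs, truncating to the shorter input. The entropies it computes are exactly the hx/hy/hxy inputs of the ned and nte sketches above:

use std::collections::HashMap;

fn mutual_information(x: &[u8], y: &[u8]) -> f64 {
    let n = x.len().min(y.len());
    let mut px = [0usize; 256];
    let mut py = [0usize; 256];
    let mut pxy: HashMap<(u8, u8), usize> = HashMap::new();
    for i in 0..n {
        px[x[i] as usize] += 1;
        py[y[i] as usize] += 1;
        *pxy.entry((x[i], y[i])).or_insert(0) += 1;
    }
    // −p log₂ p contribution of a count c.
    let term = |c: usize| {
        let p = c as f64 / n as f64;
        -p * p.log2()
    };
    let hx: f64 = px.iter().filter(|&&c| c > 0).map(|&c| term(c)).sum();
    let hy: f64 = py.iter().filter(|&&c| c > 0).map(|&c| term(c)).sum();
    let hxy: f64 = pxy.values().map(|&c| term(c)).sum();
    hx + hy - hxy
}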

§5. Divergences & Distances

  • Total Variation Distance (TVD): δ(P,Q) = 0.5 * Σ |P(x) - Q(x)|
  • Normalized Hellinger Distance (NHD): sqrt(1 - Σ sqrt(P(x)Q(x)))
  • Kullback-Leibler Divergence (KL): D_KL(P||Q) = Σ P(x) log(P(x)/Q(x))
  • Jensen-Shannon Divergence (JSD): Symmetrized and smoothed KL divergence.
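
All four operate on a pair of already-normalized distributions; a minimal sketch over probability vectors, using the usual 0·log 0 := 0 convention and KL = +∞ when Q(x) = 0 < P(x):

fn tvd(p: &[f64], q: &[f64]) -> f64 {
    0.5 * p.iter().zip(q).map(|(a, b)| (a - b).abs()).sum::<f64>()
}

fn nhd(p: &[f64], q: &[f64]) -> f64 {
    let bc: f64 = p.iter().zip(q).map(|(a, b)| (a * b).sqrt()).sum();
    (1.0 - bc).max(0.0).sqrt() // clamp guards against rounding pushing 1 − BC below 0
}

fn kl(p: &[f64], q: &[f64]) -> f64 {
    p.iter()
        .zip(q)
        .filter(|(a, _)| **a > 0.0)
        .map(|(a, b)| if *b > 0.0 { a * (a / b).log2() } else { f64::INFINITY })
        .sum()
}

fn jsd(p: &[f64], q: &[f64]) -> f64 {
    let m: Vec<f64> = p.iter().zip(q).map(|(a, b)| 0.5 * (a + b)).collect();
    0.5 * kl(p, &m) + 0.5 * kl(q, &m)
}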

§6. Intrinsic Dependence (ID)

Measures the redundancy within a sequence by comparing marginal entropy to entropy rate. For example, the periodic string abababab… has high marginal entropy but near-zero entropy rate, so its ID approaches 1.

ID(X) = (H_marginal(X) - H_rate(X)) / H_marginal(X)

§7. Resistance to Transformation

Quantifies how much information is preserved after a transformation T is applied: R is 1 for an information-preserving (e.g., bijective) T, and 0 for a T that destroys all information about X (e.g., a constant map).

R(X, T) = I(X; T(X)) / H(X)

§Usage

use infotheory::{ncd_vitanyi, mutual_information_bytes, NcdVariant};

let x = b"some data sequence";
let y = b"another data sequence";

// Compression-based distance
let ncd = ncd_vitanyi("file1.txt", "file2.txt", "5");

// Entropy-based mutual information (Marginal / i.i.d.)
let mi_marg = mutual_information_bytes(x, y, 0);

// Entropy-based mutual information (Rate / Sequential, max_order=8)
let mi_rate = mutual_information_bytes(x, y, 8);
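
If ned_bytes and nte_bytes follow the same (x, y, max_order) convention as mutual_information_bytes (an assumption; see the function list below for the full set of variants), the remaining distances are used analogously:

use infotheory::{ned_bytes, nte_bytes};

// Hypothetical usage, assuming the (x, y, max_order) signature shape
let ned = ned_bytes(x, y, 0);  // marginal NED
let nte = nte_bytes(x, y, 8);  // rate-based NTE (max_order = 8)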

Modules§

aixi
MC-AIXI Implementation
axioms
Axioms: Mathematical Property Verifiers
ctw
Context Tree Weighting (CTW) and Factorized Action-Conditional CTW (FAC-CTW).
datagen
Datagen: Synthetic Data Generators for Validation
mixture
Online mixtures of probabilistic predictors (log-loss Hedge / Bayes, switching, MDL).

Structs§

InfotheoryCtx
MixtureExpertSpec
Expert specification for mixture backends.
MixtureSpec
Mixture specification for rate-backend mixtures.

Enums§

MixtureKind
Mixture policy kind for rate-backend mixtures.
NcdBackend
NcdVariant
NCD (Normalized Compression Distance) variants.
RateBackend

Functions§

biased_entropy_rate_backend
biased_entropy_rate_bytes
Compute the biased entropy rate Ĥ_biased(X) in bits/symbol.
compress_size_backend
compress_size_chain_backend
conditional_entropy_bytes
Compute conditional entropy H(X|Y) = H(X,Y) − H(Y)
conditional_entropy_paths
Conditional Entropy for files.
conditional_entropy_rate_bytes
Compute conditional entropy rate Ĥ(X|Y).
cross_entropy_bytes
Compute cross-entropy H_{train}(test): score test_data under a model trained on train_data.
cross_entropy_paths
Cross-Entropy for files.
cross_entropy_rate_backend
Cross-entropy H_{train}(test): score test_data under a model trained on train_data.
cross_entropy_rate_bytes
Compute the cross-entropy rate using ROSA/CTW/RWKV: train a model on train_data and evaluate the probability of test_data.
d_kl_bytes
Kullback-Leibler Divergence D_KL(P || Q) = Σ p(x) log(p(x) / q(x))
entropy_rate_backend
entropy_rate_bytes
Compute entropy rate Ĥ(X) in bits/symbol using ROSA LM.
get_bytes_from_paths
get_compressed_size
Base compression functions.
get_compressed_size_parallel
get_compressed_sizes_from_paths
Chooses an appropriate parallelization strategy for batch compression.
get_default_ctx
Returns the current default information theory context for the thread.
get_parallel_compressed_sizes_from_parallel_paths
get_parallel_compressed_sizes_from_sequential_paths
get_sequential_compressed_sizes_from_parallel_paths
get_sequential_compressed_sizes_from_sequential_paths
Bulk file compression functions.
intrinsic_dependence_bytes
Primitive 6: Intrinsic Dependence (Redundancy Ratio).
joint_entropy_rate_backend
joint_entropy_rate_bytes
Compute joint entropy rate Ĥ(X,Y).
joint_marginal_entropy_bytes
Compute joint marginal entropy H(X,Y) = −Σ p(x,y) log₂ p(x,y) in bits/symbol-pair.
js_div_bytes
Jensen-Shannon Divergence JSD(P || Q) = 1/2 D_KL(P || M) + 1/2 D_KL(Q || M) where M = 1/2 (P + Q)
js_divergence_paths
Jensen-Shannon Divergence for files.
kl_divergence_paths
KL Divergence for files.
load_rwkv7_model_from_path
marginal_entropy_bytes
Compute marginal (Shannon) entropy H(X) = −Σ p(x) log₂ p(x) in bits/symbol.
mutual_information_bytes
Compute mutual information I(X;Y) = H(X) + H(Y) - H(X,Y).
mutual_information_marg_bytes
Marginal Mutual Information (exact/histogram)
mutual_information_paths
Mutual Information for files.
mutual_information_rate_backend
mutual_information_rate_bytes
Entropy Rate Mutual Information (ROSA predictive)
ncd_bytes
ncd_bytes_backend
ncd_bytes_default
NCD with bytes using the default context.
ncd_cons
ncd_matrix_bytes
Computes an NCD matrix (row-major, len = n*n) for in-memory byte blobs.
ncd_matrix_paths
Computes an NCD matrix (row-major, len = n*n) for files (preloads all files into memory once).
ncd_paths
ncd_paths_backend
ncd_sym_cons
ncd_sym_vitanyi
ncd_vitanyi
Back-compat convenience wrappers (operate on file paths).
ned_bytes
NED(X,Y) = (H(X,Y) - min(H(X), H(Y))) / max(H(X), H(Y))
ned_cons_bytes
NED_cons(X,Y) = (H(X,Y) - min(H(X), H(Y))) / H(X,Y)
ned_cons_marg_bytes
ned_cons_rate_bytes
ned_marg_bytes
Marginal NED (exact/histogram)
ned_paths
NED for files.
ned_rate_backend
ned_rate_bytes
Normalized Entropy Distance (Rate-based)
nhd_bytes
NHD(X,Y) = sqrt(1 - BC(X,Y)) where BC = Σᵢ sqrt(p_X(i) · p_Y(i))
nhd_paths
NHD for files.
nte_bytes
NTE(X,Y) = VI(X,Y) / max(H(X), H(Y)) where VI(X,Y) = H(X|Y) + H(Y|X) = 2H(X,Y) - H(X) - H(Y).
nte_marg_bytes
nte_paths
NTE for files.
nte_rate_backend
nte_rate_bytes
resistance_to_transformation_bytes
Primitive 7: Resistance under Allowed Transformations.
set_default_ctx
Sets the default information theory context for the thread.
tvd_bytes
TVD_marg(X,Y) = (1/2) Σᵢ |p_X(i) - p_Y(i)|
tvd_paths
TVD for files.
validate_zpaq_rate_method