pub struct RosaPlus { /* private fields */ }

Implementations

impl RosaPlus
pub fn new(max_order: i64, use_eot: bool, eot_char: u8, seed: u64) -> Self
pub fn train_example(&mut self, s: &[u8])
pub fn build_lm(&mut self)
pub fn build_lm_no_finalize_endpos(&mut self)
Build the language model without mutating SAM endpos.
This is useful when you want to reuse a trained SAM as a stable base state (e.g. universal-prior conditioning) and need cheap checkpoint/restore via truncation.
Note: entropy/cross-entropy estimation does not require endpos finalization.
pub fn build_lm_full_bytes_no_finalize_endpos(&mut self)
Build an LM with a fixed byte alphabet of size 256.
This avoids alphabet growth issues and enables fast incremental updates.
pub fn train_example_tx(&mut self, tx: &mut RosaTx, s: &[u8])
Apply a training example and update LM counts incrementally (requires the full 256-byte alphabet).
pub fn train_sequence_tx(&mut self, tx: &mut RosaTx, s: &[u8])
Apply a sequential update without inserting a boundary (continuous stream).
pub fn rollback_tx(&mut self, tx: RosaTx)
Roll back a transaction, restoring the model to the exact state at begin_tx.
pub fn ensure_lm_built_no_finalize_endpos(&mut self)
Ensure the LM is built (without mutating SAM endpos).
pub fn lm_alpha_n(&self) -> usize
Current LM alphabet size (0 if LM not built).
pub fn estimated_size_bytes(&self) -> usize
pub fn shrink_aux_buffers(&mut self)
pub fn fork_from_sam(&self) -> Self
Create a new model that shares the same trained SAM state but resets LM-related buffers.
This is substantially cheaper than cloning the full RosaPlus (which includes LM counts,
node tables, and distribution buffers) and is safe for workflows that want to start from
a fixed base training text (e.g. a universal prior) and then add candidate-specific text.
pub fn checkpoint(&self) -> RosaCheckpoint
Capture a checkpoint that allows restoring the ROSA model to its current trained state by truncating append-only internal buffers.
Intended for workflows that repeatedly evaluate different continuations from the same base training text (e.g. universal-prior conditioned scoring).
pub fn restore(&mut self, ck: &RosaCheckpoint)
Restore the model to a previously captured checkpoint.
This invalidates the LM; callers should rebuild it before scoring.
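The checkpoint/restore mechanism above relies on the model's internal buffers being append-only, so restoring reduces to truncation. A minimal self-contained sketch of that idea, with a hypothetical `TinyModel` standing in for the real (private) RosaPlus layout:

```rust
// Hypothetical illustration of checkpoint-by-truncation; `TinyModel` and its
// fields are NOT the real RosaPlus internals, just a stand-in for any model
// whose state grows only by appending.
struct TinyModel {
    nodes: Vec<u32>, // append-only buffer: grows during training
}

struct TinyCheckpoint {
    nodes_len: usize, // a checkpoint only needs to record buffer lengths
}

impl TinyModel {
    fn checkpoint(&self) -> TinyCheckpoint {
        TinyCheckpoint { nodes_len: self.nodes.len() }
    }
    fn restore(&mut self, ck: &TinyCheckpoint) {
        // Truncating back to the recorded length discards everything
        // trained after the checkpoint, making repeated
        // evaluate-then-restore loops cheap.
        self.nodes.truncate(ck.nodes_len);
    }
}

fn main() {
    let mut m = TinyModel { nodes: vec![1, 2, 3] }; // base training
    let ck = m.checkpoint();
    m.nodes.extend([4, 5]); // candidate-specific training
    m.restore(&ck);
    assert_eq!(m.nodes, vec![1, 2, 3]); // back to the base state
}
```

Because truncation never invalidates the base prefix, the same checkpoint can be restored any number of times, which is what makes per-candidate scoring from a shared universal prior cheap.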
pub fn generate(&mut self, prompt: &[u8], steps: i32) -> Option<Vec<u8>>
pub fn get_distribution(&mut self, context: &[u8]) -> Vec<(u32, f64)>
Returns the probability distribution for the next symbol given a context. Output: Vec of (codepoint, probability) pairs, sorted by codepoint. Builds the LM if not already built.
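To make the output shape concrete, here is a hypothetical stand-in that computes an empirical next-byte distribution by counting context occurrences. The real RosaPlus uses a suffix automaton with smoothing rather than raw counts, but the result format is the same: `(codepoint, probability)` pairs sorted by codepoint.

```rust
use std::collections::BTreeMap;

// Hypothetical stand-in for a next-symbol distribution: count which byte
// follows each occurrence of `context` in `data`, then normalize.
// BTreeMap keeps the output sorted by codepoint, matching get_distribution.
fn empirical_distribution(data: &[u8], context: &[u8]) -> Vec<(u32, f64)> {
    let mut counts: BTreeMap<u32, u64> = BTreeMap::new();
    let mut total = 0u64;
    for window in data.windows(context.len() + 1) {
        if &window[..context.len()] == context {
            *counts.entry(window[context.len()] as u32).or_insert(0) += 1;
            total += 1;
        }
    }
    counts
        .into_iter()
        .map(|(cp, c)| (cp, c as f64 / total as f64))
        .collect()
}

fn main() {
    // After "a" in "abracadabra": b twice, c once, d once (the final "a"
    // has no successor and is skipped by `windows`).
    let dist = empirical_distribution(b"abracadabra", b"a");
    assert_eq!(dist, vec![(98, 0.5), (99, 0.25), (100, 0.25)]);
}
```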
pub fn predictive_entropy_rate(&mut self, data: &[u8]) -> f64
Compute the predictive entropy rate (bits per symbol) of the given data.
Uses chunked prequential scoring (train on past chunks, score next chunk).
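The chunked prequential scheme can be sketched with a Laplace-smoothed unigram model standing in for the real SAM-based scorer (an assumption for illustration only): each chunk is scored under the counts accumulated from all previous chunks, and only then added to the training counts.

```rust
// Chunked prequential scoring sketch: train on past chunks, score the next
// chunk, report average bits per symbol. The unigram-with-Laplace-prior
// scorer here is a hypothetical stand-in, not the RosaPlus model.
fn prequential_bits_per_symbol(data: &[u8], chunk: usize) -> f64 {
    let mut counts = [1u64; 256]; // Laplace prior: every byte starts at 1
    let mut total = 256u64;
    let mut bits = 0.0;
    let mut scored = 0usize;
    for c in data.chunks(chunk) {
        for &b in c {
            // Score under past-only counts: the chunk being scored has not
            // yet been trained on.
            let p = counts[b as usize] as f64 / total as f64;
            bits += -p.log2();
            scored += 1;
        }
        for &b in c {
            // Now fold the chunk into the training counts.
            counts[b as usize] += 1;
            total += 1;
        }
    }
    if scored == 0 { 0.0 } else { bits / scored as f64 }
}

fn main() {
    // A constant stream becomes cheap to predict once the model has seen
    // a few chunks of it.
    let rate = prequential_bits_per_symbol(&vec![b'a'; 2000], 100);
    println!("{:.3} bits/symbol", rate);
    assert!(rate > 0.0 && rate < 2.0);
}
```

The first chunk is always scored under the prior alone, so the estimate converges to the model's steady-state rate only as more chunks arrive; this mirrors why prequential scoring never "peeks" at the data it is about to score.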
pub fn entropy_rate_cps(&mut self, cps: &[u32]) -> f64
pub fn cross_entropy(&self, data: &[u8]) -> f64
pub fn cross_entropy_cps(&self, data: &[u32]) -> f64
pub fn marginal_distribution(&self) -> Vec<(u32, f64)>
Returns the marginal (unigram) distribution over the training data. Output: Vec of (codepoint, probability) pairs, sorted by codepoint.
pub fn marginal_entropy(&self) -> f64
Compute the marginal entropy H(X) from the unigram distribution. Returns bits per symbol.
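The relation between `marginal_distribution` and `marginal_entropy` can be checked directly: H(X) = -Σₓ p(x) log₂ p(x) over the unigram probabilities. The helper below is a hypothetical stand-in, not the crate's implementation:

```rust
// Shannon entropy in bits of a (codepoint, probability) distribution,
// the same output shape marginal_distribution documents. Zero-probability
// entries are skipped, following the convention 0 * log2(0) = 0.
fn entropy_bits(dist: &[(u32, f64)]) -> f64 {
    dist.iter()
        .filter(|(_, p)| *p > 0.0)
        .map(|(_, p)| -p * p.log2())
        .sum()
}

fn main() {
    // A uniform distribution over 4 symbols carries exactly 2 bits.
    let uniform: Vec<(u32, f64)> = (0..4).map(|cp| (cp, 0.25)).collect();
    assert!((entropy_bits(&uniform) - 2.0).abs() < 1e-12);
    // A deterministic distribution carries 0 bits.
    assert!(entropy_bits(&[(65, 1.0)]).abs() < 1e-12);
}
```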