pub struct RosaPlus { /* private fields */ }
ROSA+ predictive model with optional transactional updates.
Implementations
impl RosaPlus
pub fn new(max_order: i64, use_eot: bool, eot_char: u8, seed: u64) -> Self
Create a new ROSA+ model.
max_order < 0 enables adaptive order selection in predictive scoring.
pub fn train_example(&mut self, s: &[u8])
Train on one byte sequence, optionally appending the EOT marker.
pub fn reserve_for_stream(&mut self, additional_bytes: usize)
Reserve append-only buffers for a future byte stream.
This keeps compression-side updates from repeatedly growing the same vectors.
pub fn build_lm_no_finalize_endpos(&mut self)
Build the language model without mutating SAM endpos.
This is useful when you want to reuse a trained SAM as a stable base state (e.g. universal-prior conditioning) and need cheap checkpoint/restore via truncation.
Note: entropy/cross-entropy estimation does not require endpos finalization.
pub fn build_lm_full_bytes_no_finalize_endpos(&mut self)
Build an LM with a fixed byte alphabet of size 256.
This avoids alphabet growth issues and enables fast incremental updates.
pub fn train_example_tx(&mut self, tx: &mut RosaTx, s: &[u8])
Apply a training example and update LM counts incrementally. The LM must have been built with the full 256-symbol byte alphabet.
pub fn train_sequence_tx(&mut self, tx: &mut RosaTx, s: &[u8])
Apply a sequential update without inserting a boundary (continuous stream).
pub fn train_sequence(&mut self, s: &[u8])
Apply a sequential byte-stream update without rollback bookkeeping.
pub fn train_byte(&mut self, b: u8)
Apply a single byte sequential update without rollback bookkeeping.
pub fn reset_conditioning_cursor(&mut self)
Reset only the predictive cursor while preserving the trained SAM/LM.
pub fn advance_conditioning_byte(&mut self, b: u8)
Advance only the predictive cursor without mutating fitted counts.
pub fn rollback_tx(&mut self, tx: RosaTx)
Roll back a transaction, restoring the model to the exact state at begin_tx.
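One common way to get exact rollback is an undo log, sketched below. This is an illustration only, not the crate's implementation: `Counts`, `Tx`, `bump_tx`, and this `begin_tx` are hypothetical stand-ins, and the internals of `RosaTx` are not documented here.

```rust
/// Hypothetical stand-in for a table of LM counts; not the crate's type.
struct Counts {
    table: Vec<u32>,
}

/// Undo log: each entry remembers a slot and the value it held before a write.
struct Tx {
    undo: Vec<(usize, u32)>,
}

impl Counts {
    /// Start an empty transaction.
    fn begin_tx(&self) -> Tx {
        Tx { undo: Vec::new() }
    }

    /// Increment one slot, logging the previous value first.
    fn bump_tx(&mut self, tx: &mut Tx, slot: usize) {
        tx.undo.push((slot, self.table[slot]));
        self.table[slot] += 1;
    }

    /// Consume the transaction and undo its writes in reverse order.
    fn rollback_tx(&mut self, tx: Tx) {
        for (slot, old) in tx.undo.into_iter().rev() {
            self.table[slot] = old;
        }
    }
}
```

Taking the transaction by value, as `rollback_tx` does in the signature above, prevents the same log from being replayed twice.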
pub fn ensure_lm_built_no_finalize_endpos(&mut self)
Ensure the LM is built (without mutating SAM endpos).
pub fn lm_alpha_n(&self) -> usize
Current LM alphabet size (0 if LM not built).
pub fn estimated_size_bytes(&self) -> usize
Approximate in-memory footprint of major model buffers.
pub fn shrink_aux_buffers(&mut self)
Shrink auxiliary scratch buffers to fit current usage.
pub fn fork_from_sam(&self) -> Self
Create a new model that shares the same trained SAM state but resets LM-related buffers.
This is substantially cheaper than cloning the full RosaPlus (which includes LM counts,
node tables, and distribution buffers) and is safe for workflows that want to start from
a fixed base training text (e.g. a universal prior) and then add candidate-specific text.
pub fn checkpoint(&self) -> RosaCheckpoint
A checkpoint that allows restoring the ROSA model back to a previous trained state by truncating append-only internal buffers.
Intended for workflows that repeatedly evaluate different continuations from the same base training text (e.g. universal-prior conditioned scoring).
pub fn restore(&mut self, ck: &RosaCheckpoint)
Restore the model to a previously captured checkpoint.
This invalidates the LM; callers should rebuild it before scoring.
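The truncation idea can be sketched with a toy append-only store. The types and field names below (`AppendOnly`, `Checkpoint`, `text`, `states`) are hypothetical and do not reflect the layout of `RosaCheckpoint`; the sketch also assumes that entries written before the checkpoint are never mutated afterwards, which is the property the no-finalize build methods are described as preserving.

```rust
/// Hypothetical append-only store; the field names are illustrative only.
struct AppendOnly {
    text: Vec<u8>,
    states: Vec<u32>,
}

/// A checkpoint only needs the buffer lengths at capture time.
#[derive(Clone, Copy)]
struct Checkpoint {
    text_len: usize,
    states_len: usize,
}

impl AppendOnly {
    /// Record the current lengths without copying any data.
    fn checkpoint(&self) -> Checkpoint {
        Checkpoint {
            text_len: self.text.len(),
            states_len: self.states.len(),
        }
    }

    /// Append bytes; existing entries are left untouched.
    fn append(&mut self, bytes: &[u8]) {
        for &b in bytes {
            self.text.push(b);
            self.states.push(u32::from(b));
        }
    }

    /// Drop everything appended after the checkpoint was taken.
    fn restore(&mut self, ck: &Checkpoint) {
        self.text.truncate(ck.text_len);
        self.states.truncate(ck.states_len);
    }
}
```

Because a checkpoint is just a pair of lengths, capturing one is cheap and the same checkpoint can be restored repeatedly, once per candidate continuation.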
pub fn generate(&mut self, prompt: &[u8], steps: i32) -> Option<Vec<u8>>
Generate continuation bytes from a prompt.
Returns None if the LM has not been built yet.
pub fn get_distribution(&mut self, context: &[u8]) -> Vec<(u32, f64)>
Returns the probability distribution for the next symbol given a context. Output: Vec of (codepoint, probability) pairs, sorted by codepoint. Builds the LM if not already built.
pub fn predictive_entropy_rate(&mut self, data: &[u8]) -> f64
Compute the predictive entropy rate (bits per symbol) of the given data.
Uses chunked prequential scoring (train on past chunks, score next chunk).
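A minimal sketch of chunked prequential scoring follows. It substitutes a byte unigram model with add-one smoothing for ROSA+, so the numbers it produces are not comparable to this method's output; `Unigram`, `prequential_bits_per_symbol`, and the `chunk_size` parameter are assumptions made for the illustration.

```rust
/// Stand-in model: byte unigram counts with add-one smoothing (not ROSA+).
struct Unigram {
    counts: [u64; 256],
    total: u64,
}

impl Unigram {
    fn new() -> Self {
        Unigram { counts: [0; 256], total: 0 }
    }

    /// Fold a chunk into the counts.
    fn train(&mut self, chunk: &[u8]) {
        for &b in chunk {
            self.counts[b as usize] += 1;
            self.total += 1;
        }
    }

    /// Code length of `b`, in bits, under the current counts.
    fn bits(&self, b: u8) -> f64 {
        let p = (self.counts[b as usize] + 1) as f64 / (self.total + 256) as f64;
        -p.log2()
    }
}

/// Score each chunk with the model fitted on earlier chunks only,
/// then train on that chunk before moving to the next one.
fn prequential_bits_per_symbol(data: &[u8], chunk_size: usize) -> f64 {
    if data.is_empty() || chunk_size == 0 {
        return 0.0;
    }
    let mut model = Unigram::new();
    let mut total_bits = 0.0;
    for chunk in data.chunks(chunk_size) {
        total_bits += chunk.iter().map(|&b| model.bits(b)).sum::<f64>();
        model.train(chunk);
    }
    total_bits / data.len() as f64
}
```

Scoring before training is what makes the estimate predictive: no symbol is ever scored by a model that has already seen it. Smaller chunks let the model adapt sooner at the cost of more update steps.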
pub fn entropy_rate_cps(&mut self, cps: &[u32]) -> f64
Predictive entropy rate on codepoint streams.
pub fn cross_entropy(&self, data: &[u8]) -> f64
Cross entropy of byte data under current LM state.
pub fn cross_entropy_cps(&self, data: &[u32]) -> f64
Cross entropy of codepoint data under current LM state.
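For reference, cross entropy against a fixed model is the average code length -(1/n) * sum over i of log2 q(x_i). The sketch below evaluates it for a context-free distribution `probs`; this is a simplification chosen for brevity, since the LM here conditions each symbol on its context. The function name, the dense 256-entry layout, and the bits-per-symbol unit (taken by analogy with the entropy methods) are assumptions.

```rust
/// Average code length, in bits per symbol, of `data` under fixed probabilities.
/// A symbol with probability zero makes the result infinite.
fn cross_entropy_bits(probs: &[f64; 256], data: &[u8]) -> f64 {
    if data.is_empty() {
        return 0.0;
    }
    let bits: f64 = data.iter().map(|&b| -probs[b as usize].log2()).sum();
    bits / data.len() as f64
}
```

Unlike prequential scoring, the model is held fixed while the data is scored, which matches the `&self` receiver on these two methods.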
pub fn marginal_distribution(&self) -> Vec<(u32, f64)>
Returns the marginal (unigram) distribution over the training data. Output: Vec of (codepoint, probability) pairs, sorted by codepoint.
pub fn marginal_entropy(&self) -> f64
Compute the marginal entropy H(X) from the unigram distribution. Returns bits per symbol.
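The relationship between the two marginal methods can be sketched with free functions of the same names. These are standalone illustrations that work directly on a byte slice; how the crate derives its unigram counts from the trained model is not shown here and is assumed to be equivalent.

```rust
use std::collections::BTreeMap;

/// Unigram distribution as (codepoint, probability) pairs, sorted by codepoint.
fn marginal_distribution(data: &[u8]) -> Vec<(u32, f64)> {
    // BTreeMap iterates in key order, which gives the sorted output for free.
    let mut counts: BTreeMap<u32, u64> = BTreeMap::new();
    for &b in data {
        *counts.entry(u32::from(b)).or_insert(0) += 1;
    }
    let n = data.len() as f64;
    counts.into_iter().map(|(cp, c)| (cp, c as f64 / n)).collect()
}

/// H(X) = -sum over symbols of p * log2(p), in bits per symbol.
fn marginal_entropy(dist: &[(u32, f64)]) -> f64 {
    dist.iter()
        .filter(|&&(_, p)| p > 0.0) // skip zero-probability entries
        .map(|&(_, p)| -p * p.log2())
        .sum()
}
```

For example, the input `b"aabb"` yields two entries of probability 0.5 each and an entropy of 1 bit per symbol.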
pub fn prob_for_last(&mut self, sym: u32) -> f64
Probability of sym from current SAM cursor (sam.last).
pub fn fill_probs_for_last_bytes(&mut self, out: &mut [f64])
Fill a dense byte-wise probability vector for the current SAM cursor (sam.last).
out must have length at least 256. The output is normalized.
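The calling convention (a caller-owned buffer of at least 256 slots, filled with a normalized distribution) can be sketched as follows. The helper `fill_probs_from_counts` is hypothetical: it derives probabilities from plain counts with add-one smoothing rather than from the SAM cursor, and panicking on a short buffer is an assumption, since the behavior of the real method in that case is not stated.

```rust
/// Fill the first 256 slots of `out` with add-one-smoothed probabilities
/// derived from `counts`. Panics if `out` is shorter than 256.
fn fill_probs_from_counts(counts: &[u32; 256], out: &mut [f64]) {
    assert!(out.len() >= 256, "out must hold at least 256 entries");
    // Smoothed total, so that the 256 written values sum to 1.
    let total: f64 = counts.iter().map(|&c| f64::from(c) + 1.0).sum();
    for (slot, &c) in out[..256].iter_mut().zip(counts.iter()) {
        *slot = (f64::from(c) + 1.0) / total;
    }
}
```

Passing the buffer in lets a caller allocate it once and reuse it across many cursor positions instead of allocating a fresh vector per query.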
Trait Implementations
Auto Trait Implementations
impl Freeze for RosaPlus
impl RefUnwindSafe for RosaPlus
impl Send for RosaPlus
impl Sync for RosaPlus
impl Unpin for RosaPlus
impl UnwindSafe for RosaPlus
Blanket Implementations
impl<T> BorrowMut<T> for T where T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> CloneToUninit for T where T: Clone,
impl<T> Conv for T
impl<T> FmtForward for T
fn fmt_binary(self) -> FmtBinary<Self> where Self: Binary,
Causes self to use its Binary implementation when Debug-formatted.
fn fmt_display(self) -> FmtDisplay<Self> where Self: Display,
Causes self to use its Display implementation when Debug-formatted.
fn fmt_lower_exp(self) -> FmtLowerExp<Self> where Self: LowerExp,
Causes self to use its LowerExp implementation when Debug-formatted.
fn fmt_lower_hex(self) -> FmtLowerHex<Self> where Self: LowerHex,
Causes self to use its LowerHex implementation when Debug-formatted.
fn fmt_octal(self) -> FmtOctal<Self> where Self: Octal,
Causes self to use its Octal implementation when Debug-formatted.
fn fmt_pointer(self) -> FmtPointer<Self> where Self: Pointer,
Causes self to use its Pointer implementation when Debug-formatted.
fn fmt_upper_exp(self) -> FmtUpperExp<Self> where Self: UpperExp,
Causes self to use its UpperExp implementation when Debug-formatted.
fn fmt_upper_hex(self) -> FmtUpperHex<Self> where Self: UpperHex,
Causes self to use its UpperHex implementation when Debug-formatted.
fn fmt_list(self) -> FmtList<Self> where for<'a> &'a Self: IntoIterator,
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.
impl<T> Pipe for T where T: ?Sized,
fn pipe<R>(self, func: impl FnOnce(Self) -> R) -> R where Self: Sized,
fn pipe_ref<'a, R>(&'a self, func: impl FnOnce(&'a Self) -> R) -> R where R: 'a,
Borrows self and passes that borrow into the pipe function.
fn pipe_ref_mut<'a, R>(&'a mut self, func: impl FnOnce(&'a mut Self) -> R) -> R where R: 'a,
Mutably borrows self and passes that borrow into the pipe function.
fn pipe_borrow<'a, B, R>(&'a self, func: impl FnOnce(&'a B) -> R) -> R
fn pipe_borrow_mut<'a, B, R>(&'a mut self, func: impl FnOnce(&'a mut B) -> R) -> R
fn pipe_as_ref<'a, U, R>(&'a self, func: impl FnOnce(&'a U) -> R) -> R
Borrows self, then passes self.as_ref() into the pipe function.
fn pipe_as_mut<'a, U, R>(&'a mut self, func: impl FnOnce(&'a mut U) -> R) -> R
Mutably borrows self, then passes self.as_mut() into the pipe function.
fn pipe_deref<'a, T, R>(&'a self, func: impl FnOnce(&'a T) -> R) -> R
Borrows self, then passes self.deref() into the pipe function.
impl<T> Pointable for T
impl<T> Tap for T
fn tap_borrow<B>(self, func: impl FnOnce(&B)) -> Self
Immutable access to the Borrow<B> of a value.
fn tap_borrow_mut<B>(self, func: impl FnOnce(&mut B)) -> Self
Mutable access to the BorrowMut<B> of a value.
fn tap_ref<R>(self, func: impl FnOnce(&R)) -> Self
Immutable access to the AsRef<R> view of a value.
fn tap_ref_mut<R>(self, func: impl FnOnce(&mut R)) -> Self
Mutable access to the AsMut<R> view of a value.
fn tap_deref<T>(self, func: impl FnOnce(&T)) -> Self
Immutable access to the Deref::Target of a value.
fn tap_deref_mut<T>(self, func: impl FnOnce(&mut T)) -> Self
Mutable access to the Deref::Target of a value.
fn tap_dbg(self, func: impl FnOnce(&Self)) -> Self
Calls .tap() only in debug builds, and is erased in release builds.
fn tap_mut_dbg(self, func: impl FnOnce(&mut Self)) -> Self
Calls .tap_mut() only in debug builds, and is erased in release builds.
fn tap_borrow_dbg<B>(self, func: impl FnOnce(&B)) -> Self
Calls .tap_borrow() only in debug builds, and is erased in release builds.
fn tap_borrow_mut_dbg<B>(self, func: impl FnOnce(&mut B)) -> Self
Calls .tap_borrow_mut() only in debug builds, and is erased in release builds.
fn tap_ref_dbg<R>(self, func: impl FnOnce(&R)) -> Self
Calls .tap_ref() only in debug builds, and is erased in release builds.
fn tap_ref_mut_dbg<R>(self, func: impl FnOnce(&mut R)) -> Self
Calls .tap_ref_mut() only in debug builds, and is erased in release builds.
fn tap_deref_dbg<T>(self, func: impl FnOnce(&T)) -> Self
Calls .tap_deref() only in debug builds, and is erased in release builds.