RWKV7 model core (weights/state/kernels). High-performance RWKV7 inference kernel.
This module provides a portable SIMD implementation of RWKV7 built on the wide crate, so that
rustc/LLVM can pick the best ISA for each target (x86_64, aarch64, wasm32 SIMD,
or a scalar fallback on CPUs without SIMD).
§Architecture
- Matrix/vector operations are vectorized via wide (f32x8)
- State updates are optimized for RWKV7 head dimension N=64
- Memory layout is cache-friendly and alignment-aware
- No external BLAS dependencies
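The 8-lane vectorization described above can be illustrated with a minimal, std-only sketch. It does not use the wide crate and is not this module's actual kernel: a plain `[f32; 8]` accumulator stands in for f32x8, and the names `LANES`, `dot_lanes`, and `matvec` are hypothetical. The point is the shape of the loop: eight independent accumulators over exact chunks, then a scalar tail.

```rust
/// Lane count chosen to match an 8-wide f32 vector (assumption for this sketch).
const LANES: usize = 8;

/// Dot product that accumulates into eight independent lanes.
/// The fixed-size inner loop is easy for LLVM to turn into SIMD where available.
pub fn dot_lanes(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "input lengths must match");
    let mut acc = [0.0f32; LANES];
    let mut a_chunks = a.chunks_exact(LANES);
    let mut b_chunks = b.chunks_exact(LANES);
    for (ca, cb) in a_chunks.by_ref().zip(b_chunks.by_ref()) {
        for i in 0..LANES {
            acc[i] += ca[i] * cb[i];
        }
    }
    // Horizontal reduction of the lanes, then the scalar tail (len % 8 elements).
    let mut sum: f32 = acc.iter().sum();
    for (x, y) in a_chunks.remainder().iter().zip(b_chunks.remainder()) {
        sum += x * y;
    }
    sum
}

/// Row-major matrix-vector product y = W x, one lane-wise dot product per row.
pub fn matvec(w: &[f32], rows: usize, cols: usize, x: &[f32], y: &mut [f32]) {
    assert_eq!(w.len(), rows * cols, "weight length must equal rows * cols");
    assert_eq!(x.len(), cols, "input length must equal cols");
    assert_eq!(y.len(), rows, "output length must equal rows");
    for (r, out) in y.iter_mut().enumerate() {
        *out = dot_lanes(&w[r * cols..(r + 1) * cols], x);
    }
}
```

With a head dimension of N=64, each row is exactly eight full chunks, so the scalar tail never runs in that case.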
Structs§
- Config
- Model configuration.
- FullAdamState
- Adam moments for full-parameter RWKV online training.
- LayerProfiler
- Collects wall-clock timings for each transformer block.
- LayerTiming
- Timing data for a single transformer block.
- Model
- RWKV7 model.
- NullProfiler
- No-op profiler used by default to keep the fast path branch-free.
- ScratchBuffers
- Pre-allocated scratch buffers to avoid allocations in the hot path.
- State
- Full model state.
- Tensor1D
- Owned 1D tensor with aligned memory.
- Tensor2D
- Owned 2D tensor with aligned memory (row-major).
- TensorView1D
- View into external f32 data (for weights).
- TensorView2D
- View into external f32 data (for weights), row-major.
- TrainScopeMask
- Train-scope mask for RWKV full-parameter online updates.
- Weights
- Container for all loaded RWKV7 model weights.
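As a rough illustration of what a row-major view over external f32 data involves, here is a minimal sketch. `SketchView2D` and its methods are hypothetical names invented for this example. The real TensorView2D API is not shown on this page and may differ.

```rust
/// Hypothetical stand-in for a row-major 2D view over borrowed f32 data.
pub struct SketchView2D<'a> {
    data: &'a [f32],
    rows: usize,
    cols: usize,
}

impl<'a> SketchView2D<'a> {
    /// Returns None if rows * cols overflows or does not match the slice length.
    pub fn new(data: &'a [f32], rows: usize, cols: usize) -> Option<Self> {
        if rows.checked_mul(cols)? == data.len() {
            Some(Self { data, rows, cols })
        } else {
            None
        }
    }

    /// Row r occupies the contiguous range [r * cols, (r + 1) * cols).
    pub fn row(&self, r: usize) -> &'a [f32] {
        assert!(r < self.rows, "row index out of bounds");
        &self.data[r * self.cols..(r + 1) * self.cols]
    }

    /// Element (r, c); panics if either index is out of bounds.
    pub fn get(&self, r: usize, c: usize) -> f32 {
        self.row(r)[c]
    }

    /// True if the first element's address is a multiple of `align` bytes.
    pub fn is_aligned_to(&self, align: usize) -> bool {
        (self.data.as_ptr() as usize) % align == 0
    }
}
```

Because rows are contiguous, a matrix-vector kernel can take `row(r)` and stream through it linearly, which is what makes the layout cache-friendly. A check like `is_aligned_to(32)` is one way a caller might decide whether aligned 8-lane loads are safe on borrowed data.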
Traits§
- ProfilerSink
- Sink trait used by the model to surface per-layer timings without committing to a particular profiler implementation.
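The sink pattern described here can be sketched as follows. The method set of the real ProfilerSink is not shown on this page, so the names `TimingSink`, `ENABLED`, `record`, `NoopSink`, `CollectingSink`, and `run_layers` are all assumptions made for illustration. The sketch shows one way a generic sink lets a no-op implementation compile away while a collecting implementation gathers per-layer timings.

```rust
use std::time::{Duration, Instant};

/// Hypothetical sink interface; not the crate's actual trait definition.
pub trait TimingSink {
    /// Compile-time switch so the timing code can be removed entirely.
    const ENABLED: bool;
    fn record(&mut self, layer: usize, elapsed: Duration);
}

/// No-op sink: ENABLED is false, so the clock is never read.
pub struct NoopSink;

impl TimingSink for NoopSink {
    const ENABLED: bool = false;
    #[inline(always)]
    fn record(&mut self, _layer: usize, _elapsed: Duration) {}
}

/// Sink that stores one (layer index, duration) pair per call.
#[derive(Default)]
pub struct CollectingSink {
    pub timings: Vec<(usize, Duration)>,
}

impl TimingSink for CollectingSink {
    const ENABLED: bool = true;
    fn record(&mut self, layer: usize, elapsed: Duration) {
        self.timings.push((layer, elapsed));
    }
}

/// Placeholder forward pass: the arithmetic stands in for real layer work.
pub fn run_layers<S: TimingSink>(num_layers: usize, sink: &mut S) -> f32 {
    let mut x = 1.0f32;
    for layer in 0..num_layers {
        // S::ENABLED is a constant per monomorphization, so this branch folds away.
        let start = if S::ENABLED { Some(Instant::now()) } else { None };
        x = x * 0.5 + 1.0;
        if let Some(t) = start {
            sink.record(layer, t.elapsed());
        }
    }
    x
}
```

Calling `run_layers(3, &mut NoopSink)` and `run_layers(3, &mut CollectingSink::default())` computes the same result. Only the second pays for clock reads and stores three timing entries.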