OmniCoder-9B-Zero-Phase2Prime (v10)
CARL (Coherence-Aware Reinforcement Learning) LoRA adapter by Intuition Labs / Tej Desai.
Phase 2' Environment GRPO: Tool-calling through real interaction. The model learned WHICH tools solve WHICH tasks through 80 steps of GRPO with real subprocess execution.
Eval Results (2026-04-09)
| Metric | Value |
|---|---|
| Task completion | 92% |
| Tool format compliance | 99% |
| Mean tool calls | 11.09 |
| Individual tool failure rate | 43% (recovers via retry) |
| Mean tokens | 1441 |
| Phase 2' Gate | PASS |
Training
- Base model: Tesslate/OmniCoder-9B
- Method: GRPO with CodingSandboxEnv (real subprocess execution)
- Steps: 80 | Generations: 2 per prompt
- Rewards: 5-function cascade (tool_engagement + task_completion + gated_CARL + tool_format + GR3_length)
- LoRA: r=64, alpha=128, targets=qkvo+gate+up+down
Usage
CARL Naming
This adapter is also available as a merged model at wheattoast11/il-terminals-carl-omni9b-v10 (pending).
Pattern: il-terminals-carl-{base}-{tag} | Intuition Labs
Papers
- Bounded Informational Time Crystals (DOI: 10.5281/zenodo.18906944)
- Semantic Realizability (DOI: 10.5281/zenodo.18992031)
- Downloads last month
- 635
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support