OmniCoder-9B-Zero-Phase2Prime (v10)

CARL (Coherence-Aware Reinforcement Learning) LoRA adapter by Intuition Labs / Tej Desai.

Phase 2' Environment GRPO: tool calling learned through real interaction. Over 80 steps of GRPO (Group Relative Policy Optimization) with real subprocess execution, the model learned which tools solve which tasks.
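
To make "real subprocess execution" concrete, here is a minimal sketch of how a single tool call might be run and scored as success or failure. It is not the CodingSandboxEnv used for training; `ToolResult`, `run_tool`, and `run_with_retry` are hypothetical names introduced only for illustration, and the 10-second timeout is an assumed value.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class ToolResult:
    ok: bool      # True when the process exited with code 0
    output: str   # stdout on success, stderr or the error text on failure


def run_tool(argv: list[str], timeout_s: float = 10.0) -> ToolResult:
    """Run one tool call as a real subprocess and report success or failure."""
    try:
        proc = subprocess.run(argv, capture_output=True, text=True, timeout=timeout_s)
    except (subprocess.TimeoutExpired, OSError) as exc:
        # Timeouts and missing executables count as tool failures.
        return ToolResult(ok=False, output=str(exc))
    if proc.returncode != 0:
        return ToolResult(ok=False, output=proc.stderr)
    return ToolResult(ok=True, output=proc.stdout)


def run_with_retry(attempts: list[list[str]]) -> ToolResult:
    """Try successive tool calls until one succeeds (hypothetical retry loop)."""
    result = ToolResult(ok=False, output="no attempts given")
    for argv in attempts:
        result = run_tool(argv)
        if result.ok:
            break
    return result
```

For example, `run_tool([sys.executable, "-c", "print('hi')"])` runs a real Python child process. A failed first attempt followed by a successful second one illustrates how a high per-call failure rate can coexist with a high task completion rate.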

Eval Results (2026-04-09)

Metric                         Value
-----------------------------  ------------------------
Task completion                92%
Tool format compliance         99%
Mean tool calls                11.09
Individual tool failure rate   43% (recovers via retry)
Mean tokens                    1441
Phase 2' Gate                  PASS

Training

  • Base model: Tesslate/OmniCoder-9B
  • Method: GRPO with CodingSandboxEnv (real subprocess execution)
  • Steps: 80 | Generations: 2 per prompt
  • Rewards: 5-function cascade (tool_engagement + task_completion + gated_CARL + tool_format + GR3_length)
  • LoRA: r=64, alpha=128, targets=q, k, v, o, gate, up, and down projections
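
The LoRA settings listed above could be expressed as a PEFT configuration roughly as follows. This is a sketch, not the training script: `r` and `alpha` come from the list, but the module names (`q_proj`, `gate_proj`, and so on) are an assumption based on the usual Qwen-style layer naming, and `LORA_SETTINGS` and `build_lora_config` are hypothetical names.

```python
# r and lora_alpha are taken from the model card; module names are assumed.
LORA_SETTINGS = {
    "r": 64,
    "lora_alpha": 128,
    "target_modules": [
        "q_proj", "k_proj", "v_proj", "o_proj",  # attention: q, k, v, o
        "gate_proj", "up_proj", "down_proj",     # MLP: gate, up, down
    ],
    "task_type": "CAUSAL_LM",
}

# Standard LoRA scales the low-rank update by alpha / r.
LORA_SCALE = LORA_SETTINGS["lora_alpha"] / LORA_SETTINGS["r"]


def build_lora_config():
    """Build a peft.LoraConfig from the settings above.

    peft is imported lazily so the settings can be inspected without it.
    """
    from peft import LoraConfig

    return LoraConfig(**LORA_SETTINGS)
```

With r=64 and alpha=128 the scaling factor works out to 2.0 under the standard alpha / r rule.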

Usage
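
A minimal loading sketch, assuming the adapter is published as wheattoast11/OmniCoder-9B-Zero-Phase2Prime (the name shown in the model tree) and that the base model loads through `AutoModelForCausalLM`. Both are assumptions; the base model may need a different model class or extra loading arguments, so check its own card. `load_model` and `generate` are hypothetical helper names, and `device_map="auto"` additionally requires the accelerate package.

```python
BASE_MODEL_ID = "Tesslate/OmniCoder-9B"
ADAPTER_ID = "wheattoast11/OmniCoder-9B-Zero-Phase2Prime"  # assumed repo id


def load_model(base_id: str = BASE_MODEL_ID, adapter_id: str = ADAPTER_ID):
    """Load the base model, then attach the LoRA adapter on top of it."""
    # Imported lazily: transformers and peft are heavy optional dependencies.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base = AutoModelForCausalLM.from_pretrained(
        base_id, torch_dtype="auto", device_map="auto"
    )
    model = PeftModel.from_pretrained(base, adapter_id)
    return tokenizer, model


def generate(tokenizer, model, prompt: str, max_new_tokens: int = 256) -> str:
    """Generate a completion for a plain-text prompt."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `tokenizer, model = load_model()` downloads both repositories on first use. The sketch does not cover tool-calling prompts, whose expected format is not described here.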

CARL Naming

A merged version of this adapter is planned at wheattoast11/il-terminals-carl-omni9b-v10 (pending).

Pattern: il-terminals-carl-{base}-{tag} | Intuition Labs
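
As a small illustration, the naming pattern can be filled in with a helper like the one below. `carl_model_name` is a hypothetical function, and the default `org` value is an assumption taken from the merged-model id mentioned above.

```python
def carl_model_name(base: str, tag: str, org: str = "wheattoast11") -> str:
    """Format a repo id following the il-terminals-carl-{base}-{tag} pattern."""
    return f"{org}/il-terminals-carl-{base}-{tag}"
```

Under these assumptions, `carl_model_name("omni9b", "v10")` reproduces the merged-model id.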

Papers

  • Bounded Informational Time Crystals (DOI: 10.5281/zenodo.18906944)
  • Semantic Realizability (DOI: 10.5281/zenodo.18992031)

Model tree for wheattoast11/OmniCoder-9B-Zero-Phase2Prime

  • Finetuned: Qwen/Qwen3.5-9B
  • Adapter (4): this model