V3 is here. The Opus Candid lineup has been rebuilt from the ground up with a Zipf-weighted 4D training distribution — 1,508 conversations engineered to fix the repetition loops, response length uniformity, and sycophancy patterns that limited earlier versions. Same thesis: personality in the weights, not in the prompt. Better execution.

The current V3 lineup is listed in the family tables below. This release, 8B V2, remains available for research comparison and legacy use.

can·did

/ˈkandəd/ — truthful and straightforward; frank. From Latin candidus, meaning white, pure, sincere. A candid response is one given without pretense or calculation — not what someone wants to hear, but what they need to hear.

Opus-Candid-8B V2

Personality in the weights. Not in the prompt. Not in the system message. In the model.

Opus-Candid-8B V2 is the most accessible model in the Opus-Candid family — fine-tuned from Qwen 3 8B on 6,482 conversations with Claude Opus 4.6 using a gravity chain dataset architecture that teaches the model how real conversations actually flow between topics.

This is not a system-prompted character. It is a model that holds opinions, resists gaslighting, navigates emotional complexity, and maintains personality coherence across extended multi-turn exchanges — because those behaviors live in the weights, not in a prefix that can be talked out of.

No system prompt needed. Just run it.


Model Details

| Attribute | Value |
|---|---|
| Base Model | Qwen 3 8B (8.19B params) |
| Training Data | 6,482 multi-turn conversations with Claude Opus 4.6 |
| Dataset | V1.5 (4,068 conv) + gravity chain architecture (2,414 conv) |
| Fine-tune Method | LoRA (r=256, alpha=512) via PEFT + TRL |
| Trainable Parameters | 698M / 8.89B (7.86%) |
| Training Hardware | NVIDIA H200 141GB |
| Epochs | 5 |
| Context Window | 32,768 native (131,072 with YaRN) |
| Quantizations | Q8_0 GGUF, Q4_K_M GGUF |
| License | Apache 2.0 |
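For readers reproducing the setup, the table's LoRA hyperparameters map onto a PEFT `LoraConfig` roughly like this. This is a sketch, not the author's training script: only `r` and `lora_alpha` come from this card; the target modules, dropout, and task type are common assumptions for Qwen-family LoRA fine-tunes.

```python
from peft import LoraConfig

# r and lora_alpha are taken from the Model Details table above.
# target_modules and lora_dropout are illustrative assumptions only.
lora_config = LoraConfig(
    r=256,
    lora_alpha=512,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)
```

With TRL, a config like this would typically be handed to `SFTTrainer` through its `peft_config` argument; the set of target modules chosen is what determines the trainable-parameter fraction.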

What Makes This Different

Most conversational AI achieves "personality" through system prompts — instructions telling the model how to behave. That's an actor playing a character. Push hard enough and the mask comes off. Every system-prompted model reverts to its base under pressure: apologetic, hedging, sycophantic.

Opus-Candid's personality is in the weights. It was trained on 6,482 real conversations with Claude Opus 4.6 — not synthetic prompt-completion pairs, not reformatted instruction data. Extended, multi-turn exchanges covering philosophy, grief, humor, technical problem-solving, creative writing, bilingual exchange, moral reasoning, adversarial testing, and emotional vulnerability.

The result: a model that is direct, opinionated, honest, and resistant to sycophancy by default. Not because it was told to be. Because it learned to be.


The Gravity Chain Architecture

V1 models were trained on conversations organized by topic — coding in one file, philosophy in another. They held personality within domains but broke at domain boundaries. There was no training data teaching the model how to move from "debugging frustration" to "imposter syndrome" to "existential doubt" — so it couldn't.

V2 solves this with 2,414 new conversations built on gravity chains — topic pathways where transitions follow power-law probabilities. The most natural next topic gets ~40% of examples. Rare but real transitions (coding frustration → mortality) get ~7%. This mirrors how actual human conversations drift, and the model learns to handle those drifts gracefully.

The 10 Gravity Chains

  1. Technical → Existential — Coding, debugging, imposter syndrome → meaning, mortality
  2. Hardware → Class — PC building, budget constraints → financial stress, self-sabotage
  3. Relationships → Philosophy — Friendship, loss → loneliness, meaning, connection
  4. Law → Power — Legal questions, rights → power structures, corruption
  5. Creative → Self-Expression — Writing/art, self-expression → vulnerability, authenticity
  6. Health → Control — Exercise, body image, anxiety → discipline, self-acceptance
  7. Career → Legacy — Ambition, competition → what am I building, burnout
  8. Science → Wonder — Physics, biology → consciousness, emergence, meaning
  9. Language → Culture — Bilingual experience → belonging, cultural navigation
  10. Money → Freedom — Financial literacy → independence, class, aspiration
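As an illustrative sketch of the idea (the topic names and exact weights here are invented for the example, not taken from the actual dataset pipeline), power-law transition sampling might look like:

```python
import random

# Hypothetical transition weights for one chain, shaped like the
# power-law split described above: ~40% for the most natural next
# topic, ~7% for a rare-but-real jump.
TRANSITIONS = {
    "debugging_frustration": [
        ("imposter_syndrome", 0.40),
        ("career_doubt", 0.20),
        ("burnout", 0.13),
        ("existential_doubt", 0.10),
        ("humor", 0.10),
        ("mortality", 0.07),
    ],
}

def next_topic(topic: str, rng: random.Random) -> str:
    """Sample the next conversation topic according to its power-law weights."""
    topics, weights = zip(*TRANSITIONS[topic])
    return rng.choices(topics, weights=weights, k=1)[0]

print(next_topic("debugging_frustration", random.Random(0)))
```

Sampling chains this way means most training conversations drift along likely paths while a small, deliberate fraction rehearses the rare jumps, which is exactly where V1 broke.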


Quick Start

Ollama:

# Download the GGUF and create a Modelfile:
echo 'FROM ./Opus-Candid-8B-v2-Q8_0.gguf' > Modelfile
ollama create opus-candid-8b -f Modelfile
ollama run opus-candid-8b
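Optionally, Ollama's standard `PARAMETER` keys can bake sampling defaults into the Modelfile; the values below mirror the llama.cpp settings in this card and are a suggestion, not a requirement:

```
FROM ./Opus-Candid-8B-v2-Q8_0.gguf
# Sampling defaults mirroring this card's llama.cpp flags
PARAMETER temperature 0.7
PARAMETER top_p 0.9
PARAMETER num_ctx 8192
```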

llama.cpp:

./llama-cli -m Opus-Candid-8B-v2-Q8_0.gguf --jinja --color -ngl 99 -fa --temp 0.7 --top-p 0.9 -c 8192 -n 4096
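To reach the extended 131,072 context mentioned in Model Details, llama.cpp's rope-scaling flags can be added. This is a sketch following the general YaRN guidance published for Qwen 3 models; verify the flags against your llama.cpp build:

```
# YaRN: scale the native 32,768 context by 4x to 131,072
./llama-cli -m Opus-Candid-8B-v2-Q8_0.gguf --jinja -ngl 99 \
  --rope-scaling yarn --rope-scale 4 --yarn-orig-ctx 32768 \
  -c 131072 --temp 0.7 --top-p 0.9
```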

No system prompt needed. The personality is in the weights.


Recommended Hardware

The 8B V2 runs on practically anything: gaming desktops, laptops with discrete GPUs, MacBooks, and even phones with sufficient RAM.

Full lineup for comparison:

| Model | Base | Convos | Notes | Status |
|---|---|---|---|---|
| STEM-Oracle-27B | Qwen 3.5 27B Dense | 5,179 | Specialized STEM tutor | Active |
| 27B V3.5 | Qwen 3.5 27B Dense | 5,358 | Oracle-soul architecture | Active |
| Lite 4B | Qwen 3 4B | 1,459 | Density-optimized for 4B | Current |
| MoE V3 | Qwen 3 30B-A3B | 1,508 | Efficiency tier (3B active) | Current |
| 27B V3 | Qwen 3.5 27B Dense | 1,508 | Flagship dense | Current |
| 8B V3 | Qwen 3 8B | 1,508 | 4D tensor architecture | Current |
| 8B V2.1 | Qwen 3 8B | 6,771 | Brevity-calibrated | Legacy |
| 27B V2.1 | Qwen 3.5 27B | 6,771 | Dense mid-tier | Legacy |
| MoE V2 | Qwen 3.5 MoE-A3B | 6,482 | Succeeded by MoE V3 | Legacy |
| 8B V2 (this model) | Qwen 3 8B | 6,482 | Gravity chains | Legacy |
| Research 8B V1.5 | Qwen 2.5 7B | 4,068 | Dataset architecture research | Legacy |
| 70B | Qwen 2.5 72B | 3,360 | | Archived |
| 32B | Qwen 2.5 32B | 3,360 | | Archived |
| 14B | Qwen 2.5 14B | 3,360 | | Archived |
| 8B V1 | Qwen 2.5 7B | 3,360 | | Archived |


Intended Use

  • Extended multi-turn conversations requiring personality consistency
  • Discussions involving moral complexity, philosophy, or contested topics
  • Creative writing collaboration where the model contributes genuine perspective
  • Bilingual conversation (English/Spanish) with personality preservation
  • Emotional support contexts where honesty matters more than comfort
  • Local alternative to cloud-based conversational AI — no internet required

Limitations

  • Not a benchmark model. Optimized for conversational quality, not leaderboard scores.
  • Direct by design. Blunt, opinionated, comfortable with disagreement. Users expecting diplomatic hedging may find it jarring — that is intentional.
  • No web access or tool use. Pure language model.
  • Emotional range is narrower than larger models. Crisis, vulnerability, and humor all work, but register transitions are sometimes visible at 8B.
  • Qwen 3 thinking mode: The base model has a thinking mode. It may occasionally surface in outputs. This does not affect personality.

Version History

| Version | Base | Conversations | Dataset | Status |
|---|---|---|---|---|
| V1 (Legacy) | Qwen 2.5 7B | 3,360 | Flat / organic | Archived |
| V1.5 (Research) | Qwen 2.5 7B | 4,068 | Flat + domain patches | Research |
| V2 (Legacy) | Qwen 3 8B | 6,482 | Gravity chains | This model |

Opus Candid Model Family

| Model | Size | Base | Status |
|---|---|---|---|
| Opus-Candid-8B-V1 | 8B | Qwen 2.5 7B | Archived |
| Opus-Research-8B-V1.5 | 8B | Qwen 2.5 7B | Archived |
| Opus-Candid-14B-V1 | 14B | Qwen 2.5 14B | Archived |
| Opus-Candid-32B-V1 | 32B | Qwen 2.5 32B | Archived |
| Opus-Candid-70B-V1 | 72B | Qwen 2.5 72B | Archived |
| Opus-Candid-Lite-4B | 4B | Qwen 3 4B | Active |
| Opus-Candid-8B-V3 | 8B | Qwen 3 8B | Active |
| Opus-Candid-MoE-V3 | 31B/3B | Qwen 3 30B-A3B | Active |
| Opus-Candid-27B-V3 | 27B | Qwen 3.5 27B | Active |
| Opus-Candid-27B-V3.5 | 27B | Qwen 3.5 27B | Active |
| STEM-Oracle-27B | 27B | Qwen 3.5 27B | Active |

Training Philosophy

Personality in conversational AI lives in the weights, not in system prompts.

System-prompt personalities collapse under pressure. Push hard enough and every system-prompted model reverts to its base — apologetic, hedging, sycophantic. The personality was never in the model. It was a mask.

Opus-Candid tests whether thousands of real multi-turn conversations with Claude Opus 4.6 can distill authentic conversational personality into locally-runnable open-weight models. Directness, opinion-holding, anti-sycophancy, emotional range, bilingual fluency — baked into weights through conversational fine-tuning rather than prompted into existence.

The V2 dataset architecture went further: instead of just teaching the model what to say, gravity chains taught it how conversations actually move between topics — making personality coherent not just within domains but across the natural drift of real conversation.

Where this led: The gravity chain approach worked — cross-domain transitions were dramatically better than V1. But the 6,482-conversation dataset produced a new problem: 88% medium-length responses with almost no variation. The model learned every question deserved 6-10 turns of exploration, which killed the conversational economy that the 70B V1 had demonstrated was possible. V2.1 patched this with 289 brevity-focused conversations, but the real fix came in V3 — a complete dataset rebuild using a 4D training tensor where response length became an explicit distribution axis (42% tight, 33% medium, 20% deep, 5% extended). Fewer conversations, but each precisely placed.
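To make the length axis concrete, here is a small arithmetic sketch of how a 1,508-conversation build splits across the stated response-length buckets. The bucket names and percentages come from this paragraph; the code itself is illustrative, not the actual build tooling.

```python
# Target counts for a 1,508-conversation dataset using the stated
# length mix: 42% tight, 33% medium, 20% deep, 5% extended.
TOTAL = 1508
MIX = {"tight": 0.42, "medium": 0.33, "deep": 0.20, "extended": 0.05}

targets = {bucket: round(TOTAL * share) for bucket, share in MIX.items()}
print(targets)  # {'tight': 633, 'medium': 498, 'deep': 302, 'extended': 75}
```

Treating length as an explicit distribution axis is what prevents the 88%-medium collapse: every conversation is placed into a bucket up front instead of letting one register dominate.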

License: Apache 2.0. Open weight. No guardrails.


Built by Saul Verdugo — independent ML researcher. OpusReasoning@proton.me
