can·did

/ˈkandəd/ — truthful and straightforward; frank. From Latin candidus, meaning white, pure, sincere. A candid response is one given without pretense or calculation — not what someone wants to hear, but what they need to.

Opus-Candid-8B V3

Fine-tuned from Qwen 3 8B on 1,558 Zipf-weighted conversations distilled from Claude Opus 4.6. V3 is a ground-up rebuild — not an iteration on V2. The entire dataset architecture was redesigned around a 4-dimensional training tensor that models how real people actually talk.

No system prompt needed. No prompt engineering. No character cards. The personality is in the weights — direct, opinionated, bilingual (EN/ES), and incapable of telling you what you want to hear. It holds positions under pressure, calls out bad arguments, and knows when to shut up.


What Changed from V2.1

Everything. V3 is not a patch — it's a new dataset, new methodology, new distribution logic. The model family name is the same because the philosophy is the same: personality lives in weights, not system prompts.

V2/V2.1 used gravity chains — 10 topic-drift pathways with 6,771 conversations. Good at cross-domain transitions but suffered from uniform response length (88% medium-length) and repetition loops under pressure.

V3 uses a 4D training tensor — every conversation sits at a coordinate across topic (Zipf-weighted), response length, psychological register, and conversational position. 1,558 conversations, ~640K tokens. Fewer conversations than V2, but each one is precisely placed to cover a specific gap in the distribution.

Key improvements:

  • 42% tight responses (2-4 turns) vs V2's ~12%. The model learned when to shut up.
  • Zipf-weighted topic distribution (s ≈ 0.7) matching real conversation frequency data from Pew Research, OpenAI usage studies, and dialogue corpora.
  • Anti-sycophancy enforcement at the data level — 22 instances caught and replaced. No "Great question!" in the training set.
  • Response length variance injection — 252 conversations where responses were suspiciously uniform were fixed with deliberate length variation.
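The data-level sycophancy scan described above can be sketched as a simple phrase filter. Everything here is illustrative and assumes nothing about the actual V3 tooling: the phrase list, function name, and sample responses are invented for the sketch.

```python
import re

# Illustrative openers to flag; the real V3 filter list is not published.
SYCOPHANTIC_OPENERS = [
    r"^great question\b",
    r"^what a (great|wonderful) \w+",
    r"^i'd be happy to\b",
    r"^excellent point\b",
]
PATTERN = re.compile("|".join(SYCOPHANTIC_OPENERS), re.IGNORECASE)

def flag_sycophantic(responses):
    """Return indices of responses whose opening matches a flagged phrase."""
    return [i for i, r in enumerate(responses) if PATTERN.search(r.strip())]

# Example: the first and third responses get flagged for replacement.
hits = flag_sycophantic([
    "Great question! Let me explain.",
    "No. That argument conflates two things.",
    "I'd be happy to help with that.",
])
```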

Available Quantizations

| File | Quant | Size | Use Case |
|------|-------|------|----------|
| Opus-Candid-8B-V3-Q8_0.gguf | Q8_0 | 8.2 GB | Maximum quality. Use this if you have the VRAM. |
| Opus-Candid-8B-V3-Q6_K.gguf | Q6_K | 6.3 GB | Recommended for most users. Negligible quality loss. |

Note on Q4: We tested Q4_K_M extensively. At 8B parameters, Q4 quantization destroys the model's ability to track its own output, producing degenerate repetition loops within the first turn. This is not a repeat_penalty tuning issue — Q4 loses the weight precision needed for self-monitoring behavior. We do not ship or recommend Q4 for this model. If you need a smaller footprint, use Q6_K or look at the 4B Lite lineup (purpose-built for small hardware — Q4 survives at 4B because of density-first training).


Model Details

| Attribute | Value |
|-----------|-------|
| Base Model | Qwen 3 8B (8.19B params) |
| Training Data | 1,558 multi-turn conversations with Claude Opus 4.6 |
| Dataset Architecture | 4D training tensor (topic × length × register × position) |
| Total Tokens | ~640,000 |
| Fine-tune Method | LoRA + rsLoRA (r=64, alpha=128) via PEFT + TRL |
| Training Hardware | NVIDIA A100 SXM 80GB (RunPod) |
| Precision | bf16 |
| Epochs | 3 |
| Learning Rate | 2e-4 (cosine schedule, 5% warmup) |
| Effective Batch Size | 16 |
| Optimizer | AdamW |
| License | Apache 2.0 |

Quick Start

Works with any GGUF-compatible runtime — LM Studio, Ollama, llama.cpp, KoboldCpp. Download the GGUF, load it, and chat. No system prompt needed — the personality is in the weights.
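If you drive llama.cpp directly rather than through a chat UI, the prompt needs Qwen 3's ChatML-style template. A minimal sketch is below; in practice the runtime reads the template embedded in the GGUF, so you rarely need to build it yourself.

```python
def chatml_prompt(messages):
    """Render messages into the ChatML format Qwen 3 models expect.
    Most runtimes (LM Studio, Ollama, llama.cpp) apply this automatically
    from the template embedded in the GGUF; shown here for direct use."""
    parts = [f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages]
    parts.append("<|im_start|>assistant\n")  # cue the model to respond
    return "".join(parts)

# No system message needed -- the personality is in the weights.
prompt = chatml_prompt([
    {"role": "user", "content": "Is tabs vs spaces worth arguing about?"},
])
```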


Recommended Hardware

| Setup | Quantization | VRAM/RAM | Speed | Notes |
|-------|--------------|----------|-------|-------|
| GPU (Q8) | Q8_0 | ~9 GB VRAM | 30-60 t/s | RTX 3060 12GB and up |
| GPU (Q6) | Q6_K | ~7 GB VRAM | 35-70 t/s | RTX 3060, RX 7600, Arc A770 |
| Apple Silicon | Q6_K/Q8 | ~7-9 GB unified | 20-40 t/s | M1/M2/M3/M4 with 16GB+ |
| CPU Only | Q6_K | ~8 GB RAM | 5-15 t/s | 16GB+ system RAM |

The 4D Training Tensor

V3 treats the training dataset as a 4-dimensional space. Every conversation sits at a specific coordinate, and the distribution across each axis follows empirical frequency patterns.

Dimension 1: Topic Distribution (Zipf-weighted, s ≈ 0.7)

25 topics across 5 frequency tiers, weighted by real-world conversation frequency data. Tier 1 (daily topics: personal life, work, food, entertainment, relationships) gets 47.9% of training data. Tier 5 (occasional: legal, philosophy, creative writing) gets 3.2%. This means the model is disproportionately good at the conversations people actually have, without being useless at rare topics.
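The rank weighting itself is easy to reproduce. This is a generic Zipf sketch with s = 0.7, not the dataset's actual tier mapping; with pure rank weights the top five topics take roughly 46% of the mass, close to the 47.9% tier-1 figure above (the gap presumably comes from tier-level adjustments).

```python
def zipf_weights(n_topics, s=0.7):
    """Normalized Zipf weights: w_r proportional to 1 / r^s for rank r = 1..n."""
    raw = [1.0 / (r ** s) for r in range(1, n_topics + 1)]
    total = sum(raw)
    return [w / total for w in raw]

# Head-heavy but not winner-take-all: the head dominates while the tail
# still receives nonzero coverage, matching the tier shares described above.
weights = zipf_weights(25, s=0.7)
```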

Dimension 2: Response Length

| Length | Turns | Share |
|--------|-------|-------|
| Tight | 2-4 | 42% |
| Medium | 6-10 | 33% |
| Deep | 12-18 | 20% |
| Extended | 20+ | 5% |

The tight-to-medium ratio is the most important number in the dataset. V2.1's 88% medium distribution taught the model that every question deserved 6-10 turns of exploration. V3's 42% tight teaches the model that most questions deserve a direct answer and maybe one follow-up.

Dimension 3: Psychological Register

40% neutral/analytical, 30% engaged/conversational, 25% emotionally loaded, 5% adversarial/correction.

Dimension 4: Conversational Position

15% opening, 50% mid-thread, 20% follow-up, 15% wrap-up.
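Taken together, the four axes assign each conversation a coordinate. A toy sampler over the three categorical marginals above (topic is Zipf-weighted separately; treating the axes as independent is an assumption of this sketch, not a claim about the dataset):

```python
import random

# Marginal distributions from the dimensions above (each sums to 1.0).
AXES = {
    "length":   {"tight": 0.42, "medium": 0.33, "deep": 0.20, "extended": 0.05},
    "register": {"neutral": 0.40, "engaged": 0.30, "loaded": 0.25, "adversarial": 0.05},
    "position": {"opening": 0.15, "mid-thread": 0.50, "follow-up": 0.20, "wrap-up": 0.15},
}

def sample_coordinate(rng=random):
    """Draw one conversation coordinate, axis by axis (independence assumed)."""
    coord = {}
    for axis, dist in AXES.items():
        labels, weights = zip(*dist.items())
        coord[axis] = rng.choices(labels, weights=weights)[0]
    return coord
```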


Opus Candid Model Family

| Model | Size | Base | Status |
|-------|------|------|--------|
| Opus-Candid-Lite-4B | 4B | Qwen 3 4B | Active |
| Opus-Candid-Lite-4B-P | 4B | Qwen 3 4B | Active |
| Opus-Candid-Lite-4B-K | 4B | Qwen 3 4B | Active |
| Opus-Candid-8B-V3 (this model) | 8B | Qwen 3 8B | Active |
| Opus-Candid-MoE-V3 | 31B/3B | Qwen 3 30B-A3B | Active |
| Opus-Candid-27B-V3 | 27B | Qwen 3.5 27B | Active |
| Opus-Candid-27B-V3.5 | 27B | Qwen 3.5 27B | Active |
| STEM-Oracle-27B | 27B | Qwen 3.5 27B | Active |
| Opus-Candid-8B-V1 | 8B | Qwen 2.5 7B | Legacy |
| Opus-Research-8B-V1.5 | 8B | Qwen 2.5 7B | Legacy |
| Opus-Candid-8B-V2 | 8B | Qwen 2.5 7B | Legacy |
| Opus-Candid-8B-V2.1 | 8B | Qwen 2.5 7B | Legacy |
| Opus-Candid-14B-V1 | 14B | Qwen 2.5 14B | Legacy |
| Opus-Candid-27B-V2.1 | 27B | Qwen 2.5 27B | Legacy |
| Opus-Candid-32B-V1 | 32B | Qwen 2.5 32B | Legacy |
| Opus-Candid-MoE-V2 | 35B | Qwen 2.5 MoE | Legacy |
| Opus-Candid-70B-V1 | 72B | Qwen 2.5 72B | Legacy |

Dataset

Full V3 training data available at Verdugie/opus-candid-training-data. ShareGPT format, Apache 2.0, compatible with TRL, Axolotl, and LLaMA-Factory.

License: Apache 2.0. Open weight. No guardrails.


Opus Candid Lite — Now Available

The Q4 failure at 8B taught us something important: you can't compress personality by dropping precision; you compress it by raising density. Q4 quantization destroys the weights that handle self-monitoring. The answer was never a smaller quantization. It was a smaller model built from scratch on data engineered for maximum information per byte.

Opus Candid Lite is its own model, not a quantization of anything. It is built on Qwen 3 4B with a dataset where every response earned its place through a Pareto-optimal information density analysis. It has the highest information density of any model in the family, because at 2.3 GB (Q4_K_M) nothing gets to be filler. The Lite lineup splits into two forks: Lite-P (personality-optimized, 22-word median) and Lite-K (knowledge-optimized, 11-word median).

The Research: Information Density Equilibrium

Every model in the Opus Candid family uses the same voice, the same opinions, the same dataset philosophy. What changes between sizes is how that signal is compressed. For Lite, we asked: what's the theoretical maximum information you can pack per training token before usefulness degrades?

Information density:  I(w) = k × ln(1 + w)     (logarithmic — diminishing returns per word)
Usefulness coverage:  U(w) = 1 - e^(-0.12w)    (exponential saturation — too short = useless)
Optimal target:       maximize I(w) × U(w) / w   (info per token)

The equilibrium lands at 28 words median — 72% information density, 96.5% usefulness, and +13.2% info/token efficiency vs V3's 42-word median. Every word saved compounds into more training examples at a fixed token budget, meaning the 4B sees more diverse patterns than the 8B despite being a smaller model.
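The stated curves are easy to evaluate numerically. Below is a sketch with k = 1 (k is unspecified in the card and only rescales I(w)); it confirms the 96.5% usefulness figure at the 28-word target.

```python
import math

def info_density(w, k=1.0):
    """I(w) = k * ln(1 + w): information grows logarithmically with word count."""
    return k * math.log(1.0 + w)

def usefulness(w):
    """U(w) = 1 - e^(-0.12w): coverage saturates as responses get longer."""
    return 1.0 - math.exp(-0.12 * w)

def info_per_token(w, k=1.0):
    """Objective from the card: I(w) * U(w) / w."""
    return info_density(w, k) * usefulness(w) / w

# At the 28-word target, usefulness comes out to ~96.5%, matching the card.
u28 = usefulness(28)
```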

Why "Lite" and Not "V3 4B"

V3 at 8B teaches a model when to be concise. Opus Candid Lite teaches a model how to be maximally dense. The dataset was rebuilt from scratch:

  1. Zipf-head filtering — only the highest-frequency topics that 80% of users actually ask about
  2. Response compression — every response optimized to the 28-word equilibrium target
  3. 224 snap responses — one-liners that encode the personality DNA (direct, opinionated, zero filler)
  4. Anti-pattern enforcement — sycophancy, hedging, and filler scanned and removed at the data level

The result: 1,149 conversations, 2,291 responses, zero over 35 words, a clean bell curve peaking at 21-25 words. Opus personality at the size of a phone app.
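A hard cap like "zero over 35 words" is cheap to enforce at dataset build time. A minimal sketch (function names and sample responses are invented for illustration):

```python
def word_count(text):
    return len(text.split())

def violates_cap(responses, cap=35):
    """Return responses exceeding the word cap; an empty list means the
    dataset satisfies the zero-over-35-words constraint."""
    return [r for r in responses if word_count(r) > cap]

sample = [
    "Yes. Ship it, but write the migration first.",
    "No. Rewrites fail because scope creeps, not because the old code was bad.",
]
assert violates_cap(sample) == []
```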

This is the model for anyone who doesn't have a GPU. Integrated graphics, phones, Raspberry Pi. The same voice that ships on a 4090 at 27B, running on hardware that costs $50.


Built by Saul Verdugo — independent ML researcher.
