metadata
language:
  - en
license: apache-2.0
library_name: transformers
tags:
  - reasoning
  - qwen3.5
  - conversational
  - unsloth
  - self-correction
  - chain-of-thought
base_model: unsloth/Qwen3.5-27B
pipeline_tag: text-generation

Harmonic-27B

The flagship of the Harmonic series. A reasoning-focused fine-tune of Qwen 3.5 27B trained on the same structurally validated data as Harmonic-9B and Harmonic-2B. Every row passes automated quality gates. No junk, no filler, no shallow traces.

The name comes from harmonic analysis of reasoning patterns — the structural signal that separates genuine thinking from surface-level chain-of-thought.

Training Approach

Same pipeline as Harmonic-9B: 799 rows, a small, precisely curated dataset rather than tens of thousands of unfiltered examples. The base model already has the knowledge from pretraining; the fine-tune teaches it a reasoning behavior pattern.

Every training row contains explicit self-correction ("wait, that's not right"), verification ("let me check by plugging back in"), and multi-path exploration ("alternatively, I could try..."). The data was generated from multiple frontier models and filtered through a custom structural quality pipeline that enforces reasoning depth, coherence, and flow patterns. 100% of rows pass all quality gates simultaneously.
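
As a rough illustration of what detecting these three behaviors could look like, here is a minimal sketch that counts marker phrases in a reasoning trace. It is not the structural quality pipeline used for Harmonic, whose details are not published in this card. The `MARKERS` patterns and the `count_markers` helper are hypothetical, built only from the example phrases quoted above.

```python
import re
from typing import Dict

# Hypothetical marker phrases, taken from the examples quoted in this card.
# A real quality pipeline would need far more than keyword matching.
MARKERS = {
    "self_correction": [r"\bwait\b", r"that's not right"],
    "verification": [r"let me (?:check|verify)", r"plugging back in"],
    "exploration": [r"\balternatively\b", r"i could try"],
}


def count_markers(trace: str) -> Dict[str, int]:
    """Count case-insensitive marker-phrase hits per behavior category."""
    lowered = trace.lower()
    return {
        name: sum(len(re.findall(pattern, lowered)) for pattern in patterns)
        for name, patterns in MARKERS.items()
    }
```

A trace with a zero count in any category would fail a "100% of rows" style check under this simplified heuristic.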

Training Data Quality

The same reasoning data as Harmonic-9B and Harmonic-2B, curated using a custom structural process supervision pipeline:

| Metric | Value |
| --- | --- |
| Signal quality score | 78.7 mean (61.5 min, 90.0 max) |
| Thinking trace depth | 1,667 words average |
| Self-correction | 100% of rows (17.2 per row avg) |
| Verification | 100% of rows (10.3 per row avg) |
| Exploration | 100% of rows (6.3 per row avg) |
| Quality gate pass rate | 100% |

How It Compares

We ran our structural quality analysis against every major public reasoning dataset used for Opus/Qwen distillation. The results:

| Dataset | Rows | Think Words | Self-Correction | Verification | Exploration | Signal Score | Gate Pass |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Harmonic (ours) | 799 | 1,667 | 100% | 100% | 100% | 78.7 | 100% |
| Crownelius/Opus-3300x | 2,160 | 188 | 5.9% | 22.6% | 5.2% | 28.0 | 0.1% |
| nohurry/Opus-Filtered | 2,326 | 191 | 6.7% | 24.1% | 5.3% | 28.5 | 0.1% |
| TeichAI/Opus-250x | 250 | 323 | 17.2% | 26.8% | 6.8% | 24.6 | 0.4% |
| Jackrong/Qwen-700x | 633 | 6,653 | 97.5% | 97.6% | 69.8% | 75.6 | 22.7% |
| Bespoke-Stratos-17k | 16,710 | 1,322 | 88.2% | 72.7% | 59.7% | 71.7 | 49.0% |
| glaiveai/reasoning-20m | 22M+ | 799 | 64.1% | 41.4% | 37.3% | 46.2 | 12.8% |
| KingNish/reasoning-20k | 19,944 | 132 | 0.7% | 4.2% | 4.3% | 27.4 | 0.0% |

Speculative Decoding

Harmonic-27B pairs with Harmonic-2B for speculative decoding. Both models share the same training data, reasoning format, and architecture family (Qwen 3.5), which keeps draft token acceptance rates high.

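from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("DJLougen/Harmonic-27B")
target = AutoModelForCausalLM.from_pretrained("DJLougen/Harmonic-27B")
draft = AutoModelForCausalLM.from_pretrained("DJLougen/Harmonic-2B")

# Example prompt (illustrative); any tokenized input works here.
inputs = tokenizer("Is 221 a prime number?", return_tensors="pt")

# The draft model proposes tokens and the target model verifies them.
outputs = target.generate(
    **inputs,
    assistant_model=draft,
    max_new_tokens=512,
)
print(tokenizer.decode(outputs[0]))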
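
To show why a high draft acceptance rate matters, here is a toy sketch of greedy draft-and-verify decoding. It is a simplified illustration, not the `transformers` implementation: the "models" are plain functions over integer tokens, the target is called once per token instead of scoring all proposals in one forward pass, and the bonus token that real implementations emit after a fully accepted draft is omitted. The names `speculative_greedy`, `toy_target`, and `toy_draft` are hypothetical.

```python
from typing import Callable, List, Tuple

NextToken = Callable[[List[int]], int]


def speculative_greedy(
    target: NextToken,
    draft: NextToken,
    prompt: List[int],
    max_new_tokens: int,
    k: int = 4,
) -> Tuple[List[int], int, int]:
    """Return (new_tokens, accepted_draft_tokens, proposed_draft_tokens)."""
    seq = list(prompt)
    accepted = proposed = 0
    while len(seq) - len(prompt) < max_new_tokens:
        # The draft proposes up to k tokens autoregressively.
        budget = min(k, max_new_tokens - (len(seq) - len(prompt)))
        ctx = list(seq)
        proposal = []
        for _ in range(budget):
            token = draft(ctx)
            proposal.append(token)
            ctx.append(token)
        proposed += len(proposal)
        # The target keeps the matching prefix and corrects the first mismatch.
        for token in proposal:
            expected = target(seq)
            if token == expected:
                seq.append(token)
                accepted += 1
            else:
                seq.append(expected)
                break
    return seq[len(prompt):], accepted, proposed


def toy_target(seq: List[int]) -> int:
    """Toy target model: count upward modulo 10."""
    return (seq[-1] + 1) % 10


def toy_draft(seq: List[int]) -> int:
    """Toy draft model: agrees with the target except right after token 3."""
    return 0 if seq[-1] == 3 else (seq[-1] + 1) % 10
```

The output always equals what the target alone would produce under greedy decoding; the draft only affects how many target steps can be skipped. The closer the draft's predictions are to the target's, the more proposed tokens are accepted per round.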

Training Configuration

base_model: unsloth/Qwen3.5-27B
dataset: 799 curated reasoning rows
epochs: 1
learning_rate: 1e-4
lr_scheduler: cosine
warmup_ratio: 0.1
max_seq_length: 8192
lora_rank: 32
lora_alpha: 32
dropout: 0.05
micro_batch_size: 1
gradient_accumulation_steps: 4
weight_decay: 0.01
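
For a sense of scale, the configuration above implies a very short training run. The sketch below works through the arithmetic. It assumes a single data-parallel worker (`num_devices = 1`), which the card does not state, and trainers may round steps slightly differently.

```python
import math

# Values copied from the training configuration above.
rows = 799
epochs = 1
micro_batch_size = 1
gradient_accumulation_steps = 4
warmup_ratio = 0.1
lora_rank = 32
lora_alpha = 32

# Assumption: one data-parallel worker (the GPU count is not stated in the card).
num_devices = 1

# Sequences consumed per optimizer step.
effective_batch = micro_batch_size * gradient_accumulation_steps * num_devices

# Optimizer steps for the whole run.
steps_per_epoch = math.ceil(rows / effective_batch)
total_steps = steps_per_epoch * epochs

# Steps spent on learning-rate warmup before the cosine decay starts.
warmup_steps = round(total_steps * warmup_ratio)

# Standard LoRA scaling factor applied to the adapter update (alpha / r).
lora_scale = lora_alpha / lora_rank

print(effective_batch, total_steps, warmup_steps, lora_scale)
```

Under these assumptions the run is about 200 optimizer steps, with roughly the first 20 used for warmup.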

Usage

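from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("DJLougen/Harmonic-27B")
tokenizer = AutoTokenizer.from_pretrained("DJLougen/Harmonic-27B")
# Example prompt (illustrative): tokenize, generate, and decode a response.
inputs = tokenizer("What is 17 * 24?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0]))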

Reasoning format

The model uses think blocks for reasoning:

<|thinking|>
The user is asking about X. Let me consider two approaches...

Approach 1: ...
Approach 2: ...

I will go with Approach 1 because...

Wait, I need to be careful here - this assumes Y, which may not hold.
Let me verify by checking a special case...

Yes, that confirms the result.
<|/thinking|>

[Final answer here]
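
Downstream code often needs the final answer without the reasoning trace. The following is a minimal sketch for splitting the two, assuming the output contains at most one thinking block delimited exactly as shown above. The `split_reasoning` helper is hypothetical and not part of the model or of `transformers`.

```python
import re
from typing import Optional, Tuple

# Matches the thinking block delimiters shown in the format above.
THINK_RE = re.compile(r"<\|thinking\|>(.*?)<\|/thinking\|>", re.DOTALL)


def split_reasoning(text: str) -> Tuple[Optional[str], str]:
    """Return (reasoning, answer); reasoning is None if no block is found."""
    match = THINK_RE.search(text)
    if match is None:
        return None, text.strip()
    reasoning = match.group(1).strip()
    # Everything after the closing delimiter is treated as the final answer.
    answer = text[match.end():].strip()
    return reasoning, answer
```

If the delimiters are registered as special tokens, decoding with `skip_special_tokens=True` may strip them, so decode without that flag before calling a helper like this.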

Intended Use

  • Reasoning tasks requiring genuine multi-step thinking
  • Mathematical problem-solving with self-correction
  • Code analysis and generation with structured verification
  • General conversation (conversational ability preserved through training design)
  • Target model for speculative decoding with Harmonic-2B
  • Base model for Stage 2 agentic fine-tuning

Limitations

  • Reasoning traces can be verbose for simple questions
  • Not optimized for tool calling — see Harmonic-Hermes-9B for agentic use
  • Benchmark evaluation is ongoing

Architecture

  • Base: Qwen 3.5 27B (27.36B parameters)
  • Training: LoRA fine-tuning, merged into base weights
  • Precision: BF16
  • Context: 8192 tokens

License

Apache 2.0, the same as the base model. All training data comes from Apache 2.0 or MIT licensed sources. Commercial use is permitted.

Links