## ☕ Support This Work
I'm a PhD student in visual neuroscience at the University of Toronto who also happens to spend way too much time fine-tuning, merging, and quantizing open-weight models on rented H100s and a local DGX Spark. It's a hobby that got out of hand. If my uploads have been useful to you, consider buying a PhD student a coffee. It goes a long way toward keeping these experiments running.
# Harmonic-Hermes-9B-MLX-8bit
8-bit MLX conversion of Harmonic-Hermes-9B for local inference on Apple Silicon with mlx-lm.
| Quantization | Size | Use Case |
|---|---|---|
| 8-bit | ~8.9 GB | Near-lossless quality, 16GB+ unified memory |
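As a sanity check on the size column: an 8-bit quantization stores roughly one byte per weight, so for a ~9.65B-parameter model the back-of-envelope estimate lands right around the listed figure. The sketch below ignores per-group scale/bias overhead and mixed-precision layers, so treat it as approximate:

```python
# Back-of-envelope size estimate for an 8-bit quantization.
# 9.65B parameters at 8 bits each; per-group quantization scales,
# embeddings, and other overhead are deliberately ignored here.
params = 9.65e9
bits_per_weight = 8

approx_gib = params * bits_per_weight / 8 / 1024**3
print(f"{approx_gib:.2f} GiB")  # in the ballpark of the ~8.9 GB listed above
```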
## Other formats
| Format | Repo |
|---|---|
| GGUF (all quants) | Harmonic-Hermes-9B-GGUF |
| MLX 4-bit | Harmonic-Hermes-9B-MLX-4bit |
| MLX 8-bit | Harmonic-Hermes-9B-MLX-8bit |
| MLX BF16 | Harmonic-Hermes-9B-MLX-bf16 |
| Full weights | Harmonic-Hermes-9B |
Harmonic-Hermes-9B is the Stage 2 agentic fine-tune of Harmonic-9B — a dedicated tool-calling and agent model built on top of a strong reasoning backbone.
Where Harmonic-9B teaches the model how to think, Harmonic-Hermes-9B teaches it how to act — structured tool use, multi-turn agent workflows, and function calling, all grounded in the reasoning depth from Stage 1.
- **Stage 1 — Harmonic-9B:** Heavy reasoning fine-tune on privately generated, structurally validated data. Every row passes strict quality gates. The thinking backbone.
- **Stage 2 (this model):** Agentic fine-tune on `hermes-agent-traces-filtered` — 3,679 structurally validated agent traces with deep reasoning, tool calling, and multi-turn workflows.
## Usage
```bash
pip install mlx-lm

# Generate
mlx_lm.generate --model DJLougen/Harmonic-Hermes-9B-MLX-8bit --prompt "Use the available tools to..."

# Chat
mlx_lm.chat --model DJLougen/Harmonic-Hermes-9B-MLX-8bit
```
### Python API

```python
from mlx_lm import load, generate

model, tokenizer = load("DJLougen/Harmonic-Hermes-9B-MLX-8bit")
response = generate(
    model,
    tokenizer,
    prompt="Use the available tools to check the weather.",
    max_tokens=512,
)
print(response)
```
## Reasoning + Tool Use

The model uses `<think>` blocks for reasoning before acting:

```xml
<think>
The user wants to check the weather in Toronto. I have a get_weather tool available.
Let me call it with the right parameters...
</think>
<tool_call>
{"name": "get_weather", "arguments": {"location": "Toronto, Canada"}}
</tool_call>
```
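On the client side, the `<tool_call>` payload is plain JSON, so it can be pulled out of the generated text with a regex and `json.loads`. A minimal sketch — the `extract_tool_calls` helper is ours for illustration, not part of mlx-lm:

```python
import json
import re

def extract_tool_calls(text):
    """Return the parsed JSON payload of each <tool_call>...</tool_call> block."""
    pattern = r"<tool_call>\s*(\{.*?\})\s*</tool_call>"
    return [json.loads(m) for m in re.findall(pattern, text, re.DOTALL)]

# Example model output in the format shown above
output = (
    "<think>\nThe user wants the weather in Toronto.\n</think>\n"
    "<tool_call>\n"
    '{"name": "get_weather", "arguments": {"location": "Toronto, Canada"}}\n'
    "</tool_call>"
)

for call in extract_tool_calls(output):
    print(call["name"], call["arguments"])
```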
## How Our Training Data Compares
### Quality Comparison

#### Metrics Summary
| Metric | Harmonic Traces (ours) | Carnice GLM-5 (kai-os) |
|---|---|---|
| Rows | 3,679 | 1,627 |
| Source model | Multiple frontier models | GLM-5 via OpenRouter |
| Think block depth | 581 words avg | 40 words avg |
| Self-correction | 63.0% | 29.7% |
| Verification | 95.9% | 63.7% |
| Alternative exploration | 43.7% | 51.3% |
| Valid JSON (all tool calls) | 100% | 100% |
| Tool calls per conversation | 18.5 | 5.4 |
| Messages per conversation | 32.1 | 12.1 |
| Multi-turn (>5 messages) | 97.8% | 89.6% |
*(Charts not reproduced here: Reasoning Flow, Conversation Structure, Category Distribution.)*
**Training data:** `DJLougen/hermes-agent-traces-filtered`
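For orientation, a single trace in the dataset is a standard chat-messages conversation, with tool results fed back as `tool`-role messages between assistant turns. The schematic below follows the common Hermes agent convention; the role names and exact schema are assumptions here, so check the dataset card for the authoritative layout:

```python
# Schematic slice of one agent trace in chat-messages form.
# Structure follows the common Hermes agent convention; the actual
# dataset schema may differ in detail.
trace = {
    "messages": [
        {"role": "system", "content": "You are an agent with access to a get_weather tool."},
        {"role": "user", "content": "What's the weather in Toronto?"},
        {
            "role": "assistant",
            "content": (
                "<think>\nI have a get_weather tool; call it for Toronto.\n</think>\n"
                "<tool_call>\n"
                '{"name": "get_weather", "arguments": {"location": "Toronto, Canada"}}\n'
                "</tool_call>"
            ),
        },
        # Tool result is injected back into the conversation for the next turn
        {"role": "tool", "content": '{"temperature_c": -4, "conditions": "snow"}'},
        {"role": "assistant", "content": "It's -4 °C and snowing in Toronto right now."},
    ]
}

print(len(trace["messages"]), "messages")
```

The filtered traces average far more turns than this (32.1 messages and 18.5 tool calls per conversation, per the table above); this is just the minimal repeating unit.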
## What This Model Does
- Tool calling / function calling — structured JSON tool use in the Hermes agent format
- Multi-turn agent workflows — maintains coherent state across extended tool-use conversations
- Reasoning-grounded decisions — inherits Harmonic-9B's self-correction, verification, and exploration before committing to actions
## Architecture
- Base: Harmonic-9B (Stage 1 reasoning fine-tune of Qwen 3.5 9B)
- Parameters: 9.65B
- Training: LoRA fine-tuning, merged into base weights
- Context: 8192 tokens
## License
Apache 2.0, same as the base model. Commercial use is fully permitted.