# Qwen-3.5-unsloth-mlx

From a collection of quantizations built with AWQ-style pre-scaling using Unsloth's imatrix calibration data, followed by 3-6-bit affine quantization with the Unsloth mixed-precision recipe via MLX.
3-bit base mixed-precision quantization of Qwen/Qwen3.5-27B for Apple Silicon, using the Unsloth Dynamic quantization strategy via mlx-node.
| | Original (BF16) | This Model |
|---|---|---|
| Size | ~51 GB | 17 GB |
| Format | SafeTensors (sharded) | SafeTensors (single file) |
| Precision | BF16 uniform | Mixed 3/4/5/6/8-bit + BF16 |
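As a sanity check, the 17 GB checkpoint works out to roughly 5 bits per weight on average, consistent with a mostly 3/4/5-bit mix plus a few BF16 tensors. A quick back-of-the-envelope calculation (assuming "17 GB" means decimal gigabytes, 1 GB = 1e9 bytes):

```typescript
// Effective bits-per-weight of the quantized checkpoint.
// The decimal-GB convention is an assumption; GiB would give ~5.4 bits.
const paramCount = 27e9;     // Qwen3.5-27B
const sizeBytes = 17 * 1e9;  // quantized checkpoint size
const bitsPerWeight = (sizeBytes * 8) / paramCount;
console.log(bitsPerWeight.toFixed(2)); // ≈ 5.04
```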
| Repo | GGUF Equivalent | Size |
|---|---|---|
| Brooooooklyn/Qwen3.5-27B-UD-Q2_K_XL-mlx | UD-Q2_K_XL | 15 GB |
| Brooooooklyn/Qwen3.5-27B-UD-Q3_K_XL-mlx | UD-Q3_K_XL | 17 GB |
| Brooooooklyn/Qwen3.5-27B-UD-Q4_K_XL-mlx | UD-Q4_K_XL | 20 GB |
| Brooooooklyn/Qwen3.5-27B-UD-Q5_K_XL-mlx | UD-Q5_K_XL | 24 GB |
| Brooooooklyn/Qwen3.5-27B-UD-Q6_K_XL-mlx | UD-Q6_K_XL | 27 GB |
| Brooooooklyn/Qwen3.5-27B-UD-Q8_K_XL-mlx | UD-Q8_K_XL | 29 GB |
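A practical way to choose among these repos is by unified-memory budget. A hypothetical helper (sizes come from the table above; the ~8 GB headroom for the OS, KV cache, and activations is an assumption, not a measured figure):

```typescript
// Pick the largest quantization that fits a given unified-memory budget.
const QUANTS: Array<[repo: string, sizeGB: number]> = [
  ['Brooooooklyn/Qwen3.5-27B-UD-Q8_K_XL-mlx', 29],
  ['Brooooooklyn/Qwen3.5-27B-UD-Q6_K_XL-mlx', 27],
  ['Brooooooklyn/Qwen3.5-27B-UD-Q5_K_XL-mlx', 24],
  ['Brooooooklyn/Qwen3.5-27B-UD-Q4_K_XL-mlx', 20],
  ['Brooooooklyn/Qwen3.5-27B-UD-Q3_K_XL-mlx', 17],
  ['Brooooooklyn/Qwen3.5-27B-UD-Q2_K_XL-mlx', 15],
];

function pickQuant(totalRamGB: number, headroomGB = 8): string | undefined {
  const budget = totalRamGB - headroomGB;
  // QUANTS is sorted largest-first, so the first fit is the best fit.
  return QUANTS.find(([, sizeGB]) => sizeGB <= budget)?.[0];
}

console.log(pickQuant(32)); // → 'Brooooooklyn/Qwen3.5-27B-UD-Q5_K_XL-mlx'
```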
| Weight | Bits | Rationale |
|---|---|---|
| embed_tokens | 5-bit | KLD ~0.15, very low sensitivity |
| lm_head | 6-bit | KLD ~0.05, safest tensor |
| self_attn.q/k/v_proj | 5-bit + AWQ | KLD ~1.5-2.9, AWQ via layernorm |
| linear_attn.in_proj_qkv/z | 5-bit + AWQ | KLD ~2.9, AWQ via layernorm |
| self_attn.o_proj | BF16 | Not AWQ-correctable |
| linear_attn.out_proj | BF16 | KLD ~6.0, worst tensor |
| down_proj | 4-bit | "Slightly more sensitive" |
| gate_proj, up_proj | 3-bit | "Generally ok" at low bits |
Based on Unsloth Dynamic 2.0 per-tensor KLD analysis with imatrix AWQ pre-scaling.
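The recipe in the table boils down to a per-tensor lookup from weight path to target precision. A hypothetical sketch of that mapping (the function name and the `'bf16'` sentinel are illustrative, not part of mlx-node's API):

```typescript
// Map a weight path to its target precision per the recipe table above.
// Returns a bit width for affine quantization, or 'bf16' for full precision.
function targetBits(path: string): number | 'bf16' {
  if (path.includes('embed_tokens')) return 5;
  if (path.includes('lm_head')) return 6;
  // o_proj / out_proj are not AWQ-correctable; keep them in BF16.
  if (path.includes('self_attn.o_proj')) return 'bf16';
  if (path.includes('linear_attn.out_proj')) return 'bf16';
  // q/k/v and qkv/z projections: 5-bit with AWQ pre-scaling via layernorm.
  if (/self_attn\.[qkv]_proj/.test(path)) return 5;
  if (/linear_attn\.in_proj_(qkv|z)/.test(path)) return 5;
  if (path.includes('down_proj')) return 4;
  if (path.includes('gate_proj') || path.includes('up_proj')) return 3;
  return 'bf16'; // norms, biases, and anything unlisted stay full precision
}

console.log(targetBits('model.layers.0.self_attn.q_proj')); // → 5
console.log(targetBits('model.layers.3.mlp.gate_proj'));    // → 3
```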
```ts
import { loadModel } from '@mlx-node/lm';

const model = await loadModel('./Qwen3.5-27B-UD-Q3_K_XL-mlx');
const result = await model.chat(
  [{ role: 'user', content: 'Hello!' }],
  { maxNewTokens: 2048, temperature: 0.6, enableThinking: false },
);
console.log(result.text);
```
```sh
mlx convert -i Qwen3.5-27B -o Qwen3.5-27B-UD-Q3_K_XL-mlx -q --q-bits 3 --q-recipe unsloth --imatrix-path imatrix_unsloth.gguf
```
Apache 2.0 (inherited from base model).
Base model: Qwen/Qwen3.5-27B