⚠️ **REQUIRED — `jangtq_runtime.safetensors` sidecar must be downloaded**

Osaurus uses the native Swift JANGTQ runtime. Every JANGTQ bundle on OsaurusAI ships a small `jangtq_runtime.safetensors` sidecar (10 KB–165 KB) alongside the weight shards. If the file is absent, the Swift loader refuses to start with:

```
Error: Model '<name>' declares JANGTQ (weight_format: "mxtq") but is missing required sidecar file 'jangtq_runtime.safetensors'. Re-download the full model or obtain the sidecar from the original publisher.
```

If your local copy doesn't have it (older download, partial sync, etc.):

```sh
hf download OsaurusAI/DeepSeek-V4-Flash-JANGTQ2 jangtq_runtime.safetensors --local-dir <your-dir>
```

The file holds the deterministic codebooks and Hadamard rotation signs the Swift loader uses to decode the `*.tq_packed` weights. It must match the seed the bundle was quantized with (`mxtq_seed=42`).
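To fail fast before loading, a pre-flight check along these lines can help. This is a minimal sketch, not part of `jang_tools`; the `mxtq_seed` metadata key is an assumption (the card only states that the sidecar must match the bundle's quantization seed):

```python
from pathlib import Path
from safetensors import safe_open  # pip install safetensors

def check_jangtq_sidecar(model_dir: str, expected_seed: str = "42") -> None:
    """Raise early, with a fix-it hint, if the JANGTQ sidecar is missing or mismatched."""
    sidecar = Path(model_dir) / "jangtq_runtime.safetensors"
    if not sidecar.exists():
        raise FileNotFoundError(
            f"{sidecar} is missing. Re-download it with:\n"
            f"  hf download OsaurusAI/DeepSeek-V4-Flash-JANGTQ2 "
            f"jangtq_runtime.safetensors --local-dir {model_dir}"
        )
    with safe_open(str(sidecar), framework="np") as f:
        meta = f.metadata() or {}
    # Hypothetical key name; adjust to whatever the sidecar actually stores.
    seed = meta.get("mxtq_seed")
    if seed is not None and seed != expected_seed:
        raise ValueError(f"sidecar was built for mxtq_seed={seed}, expected {expected_seed}")
```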
DeepSeek-V4-Flash — JANGTQ2 (MLX, uniform 2-bit MXTQ baseline)
Canonical 2-bit TurboQuant MXTQ baseline — uniform across all routed experts. Simpler recipe than JANGTQ premium. 79.6 GB. 22.3 tok/s.
Model Details
| Property | Value |
|---|---|
| Base model | deepseek-ai/DeepSeek-V4-Flash |
| Parameters | 671 B total, 37 B active per token |
| Architecture | DeepseekV4 — MLA + multi-head causal residual + Compressor/Indexer long-ctx |
| Codec | TurboQuant MXTQ (Lloyd-Max codebook + Hadamard rotation) |
| Quantization plan | Uniform 2-bit MXTQ for all routed experts, 8-bit affine gs=32 for non-routed |
| Runtime | jang_tools.load_jangtq + mlx_lm.generate |
| Bundle size | 79.6 GB |
| Decode | 22.34 tok/s sustained on Mac Studio M3 Ultra (200-token greedy) |
| MMLU 200q (logit, fair seed) | 70.00% |
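The codec row above is the core of the recipe: a randomized Hadamard rotation spreads outlier weights across each group, after which a scalar Lloyd-Max codebook needs only 4 entries (2 bits) per value. A toy numpy sketch of the idea — not the Osaurus/`jang_tools` implementation; codebook initialization and rotation details here are assumptions:

```python
import numpy as np

def fwht(x):
    """Orthonormal fast Walsh-Hadamard transform (length must be a power of two)."""
    x = x.copy()
    h = 1
    while h < len(x):
        for i in range(0, len(x), 2 * h):
            a, b = x[i:i + h].copy(), x[i + h:i + 2 * h].copy()
            x[i:i + h], x[i + h:i + 2 * h] = a + b, a - b
        h *= 2
    return x / np.sqrt(len(x))  # orthonormal, so fwht is its own inverse

def lloyd_max(values, bits=2, iters=50):
    """Scalar Lloyd-Max: alternate nearest-centroid assignment and mean updates."""
    centroids = np.quantile(values, np.linspace(0.1, 0.9, 2 ** bits))
    for _ in range(iters):
        idx = np.abs(values[:, None] - centroids[None, :]).argmin(axis=1)
        for k in range(len(centroids)):
            if np.any(idx == k):
                centroids[k] = values[idx == k].mean()
    return centroids

rng = np.random.default_rng(42)                 # mirrors mxtq_seed=42
w = rng.standard_normal(256).astype(np.float32)
signs = rng.choice([-1.0, 1.0], size=w.shape)   # deterministic rotation signs

rotated = fwht(w * signs)                       # randomized Hadamard rotation
codebook = lloyd_max(rotated, bits=2)           # 4-entry codebook -> 2-bit codes
codes = np.abs(rotated[:, None] - codebook[None, :]).argmin(axis=1)

decoded = fwht(codebook[codes]) * signs         # decode: lookup + inverse rotation
print(np.abs(decoded - w).mean())               # small but nonzero reconstruction error
```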
Recipe
| Tensor class | Bits | Codec |
|---|---|---|
| Routed experts (all 256 × 43 layers, uniform) | 2-bit | MXTQ codebook |
| Attention (wq_a/wq_b/wkv/wo_a/wo_b) | 8-bit | affine gs=32 |
| Shared experts | 8-bit | affine gs=32 |
| Compressor + Indexer (long-ctx) | 8-bit | affine gs=32 |
| embed_tokens, lm_head | 8-bit | affine gs=32 |
| Norms / router gate / mHC | fp16 | passthrough |
vs JANGTQ (premium): JANGTQ uses a per-importance plan (hash-routed L0-L2 experts at 4-bit MXTQ), while JANGTQ2 is uniform 2-bit MXTQ — simpler, with a smaller risk surface.
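For the 8-bit rows, "affine gs=32" denotes per-group asymmetric quantization with one scale/zero pair per 32 weights. A minimal sketch of what that generally means — not the `jang_tools` kernel:

```python
import numpy as np

def affine_quant(w, bits=8, group_size=32):
    """Per-group affine quantization: w ≈ q * scale + zero, one (scale, zero) per group."""
    g = w.reshape(-1, group_size).astype(np.float32)
    lo = g.min(axis=1, keepdims=True)
    hi = g.max(axis=1, keepdims=True)
    scale = (hi - lo) / (2 ** bits - 1)
    scale[scale == 0] = 1.0  # guard all-constant groups
    q = np.clip(np.round((g - lo) / scale), 0, 2 ** bits - 1).astype(np.uint8)
    return q, scale, lo

def affine_dequant(q, scale, zero):
    return q.astype(np.float32) * scale + zero

w = np.random.default_rng(0).standard_normal(4096).astype(np.float32)
q, scale, zero = affine_quant(w)
print(np.abs(affine_dequant(q, scale, zero).ravel() - w).max())  # ~1e-2 for Gaussian weights
```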
Use
```python
import os
os.environ["JANG_WIRED_LIMIT_GB"] = "160"  # Mac Studio M3 Ultra
# Long context (optional):
# os.environ["VMLX_DSV4_LONG_CTX"] = "1"

import mlx.core as mx
from jang_tools.load_jangtq import load_jangtq_model
from mlx_lm.generate import generate

model, tok = load_jangtq_model("OsaurusAI/DeepSeek-V4-Flash-JANGTQ2")

text = tok.apply_chat_template(
    [{"role": "user", "content": "What is 2+2?"}],
    tokenize=False, add_generation_prompt=True,
)
print(generate(model, tok, prompt=text, max_tokens=200, verbose=True))
```
Bundle comparison (DeepSeek-V4-Flash family, MMLU 200q logit, fair seed)
| Bundle | Size | MMLU 200q | Tok/s |
|---|---|---|---|
| DeepSeek-V4-Flash-JANGTQ (premium) | 79 GB | 69.50% | 25.91 |
| DeepSeek-V4-Flash-JANGTQ2 (this) | 79.6 GB | 70.00% | 22.34 |
| DeepSeek-V4-Flash-JANG_2L | 107 GB | 71.50% | 23.77 |
| mlx-community/DeepSeek-V4-Flash-2bit-DQ | 90 GB | 50.00% | 36.03 |
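For context on the "logit" column label: logit scoring for multiple-choice benchmarks usually means one forward pass per question, picking the answer letter whose token receives the highest logit at the prompt's final position. A minimal sketch under that assumption (the card does not publish its harness, so treat this as illustrative):

```python
import mlx.core as mx

def pick_choice(model, tok, prompt, choices=("A", "B", "C", "D")):
    """Score each answer letter by its logit at the prompt's last position."""
    ids = mx.array([tok.encode(prompt)])
    logits = model(ids)[0, -1]                         # (vocab,) logits at last token
    letter_ids = [tok.encode(c)[-1] for c in choices]  # token id of each letter
    scores = mx.take(logits, mx.array(letter_ids))
    return choices[mx.argmax(scores).item()]
```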
HumanEval+ pass@1
Coming soon — a comprehensive pass@1 evaluation is in flight.
Credits
Created by Jinho Jang — eric@jangq.ai
Built on top of DeepSeek-V4-Flash (deepseek-ai).
Distributed via Osaurus AI.