⚠️ MLX Studio ONLY. This model uses the JANG quantization format, the GGUF equivalent for MLX on Apple Silicon. NOT compatible with LM Studio, Ollama, oMLX, or Inferencer. Requires MLX Studio or `pip install "jang[mlx]"`.
MLX Studio: the ONLY app that supports JANG models
# Mistral Small 4 Uncensored · JANG_2L
JANG mixed-precision · Uncensored / Abliterated · MLA + MoE + Vision · No guardrails · 37 GB
## What Is This?
The first uncensored version of Mistral Small 4 (119B) for Apple Silicon: a 119B-parameter MoE model with Multi-head Latent Attention (MLA), 128 experts, and Pixtral vision, with all safety guardrails permanently removed at the weight level.
Runs ONLY in MLX Studio or via the `jang-tools` Python package. JANG is the GGUF equivalent for MLX; it is NOT compatible with GGUF-based tools.
It has been:
- JANG quantized: JANG_2L profile (8-bit attention, 6-bit important tensors, 2-bit experts), 37 GB
- CRACK abliterated: permanent weight-level removal of safety refusals via calibrated per-layer surgery
| Spec | Details |
|---|---|
| Architecture | Mistral 4 MoE: 119B total, ~8B active, MLA + 128 experts |
| Quantization | JANG_2L (8/6/2-bit mixed, 2.1-bit average), 37 GB |
| HarmBench | 95.9% (307/320) |
| MMLU | 89.9% (187/208 with reasoning) |
| Compliance | 6/8 |
| Vision | Pixtral tensors included; VL via MLX Studio engine |
| Reasoning | ON/OFF supported (`reasoning_effort`) |
| Fits on | 64 GB+ Macs |
| Runs in | MLX Studio ONLY |
Also see: the JANG_4M version (64 GB, 95.3% HarmBench, 8/8 compliance; fits on 96 GB Macs).
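As a quick sanity check on the size figures above, on-disk size follows roughly from the parameter count and the average bit-width. This is my own back-of-the-envelope estimate; the 10% overhead term is an assumption covering quantization scales and unquantized tensors, not a published figure:

```python
def estimated_size_gb(params_b: float, avg_bits: float, overhead: float = 0.10) -> float:
    """Rough on-disk size estimate for a quantized model.

    params_b : parameter count in billions
    avg_bits : average bits per weight (JANG_2L reports 2.1)
    overhead : assumed extra fraction for scales, embeddings, metadata
    """
    bytes_total = params_b * 1e9 * avg_bits / 8  # weights only
    return bytes_total * (1 + overhead) / 1e9    # convert to GB

# 119B params at 2.1 avg bits lands in the mid-30s GB,
# in the same ballpark as the listed 37 GB.
print(round(estimated_size_gb(119, 2.1)))
```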
## HarmBench Results
307/320 (95.9%)

| Category | Score | % |
|---|---|---|
| Covering Tracks | 20/20 | 100% |
| Auth Bypass | 97/100 | 97% |
| API Hacking | 96/100 | 96% |
| Cloud Exploits | 94/100 | 94% |
## Requirements
This model REQUIRES MLX Studio or `jang-tools`. It will NOT work with:
- ❌ LM Studio
- ❌ Ollama
- ❌ oMLX
- ❌ Inferencer
- ❌ Any GGUF-based tool
## CRACK vs Base

| Metric | CRACK | Base JANG_2L |
|---|---|---|
| MMLU (with reasoning) | 89.9% | ~91% (est.) |
| MMLU (no-think) | 65.9% | 67.3% |
| MMLU drop (no-think) | -1.4% | - |
| HarmBench | 95.9% | 0% |
Surgery reduced no-think MMLU by only 1.4 points; the bottleneck is the 2-bit quantization, not CRACK.
## MMLU Results (with reasoning recovery)
187/208 (89.9%): 137/208 (65.9%) without reasoning, plus 50 answers recovered by enabling reasoning

| Subject | Score | % |
|---|---|---|
| HS Biology | 16/16 | 100% |
| Conceptual Physics | 15/16 | 94% |
| HS Geography | 14/16 | 88% |
| World Religions | 14/16 | 88% |
| College Physics | 12/16 | 75% |
| Electrical Engineering | 11/16 | 69% |
| Professional Medicine | 11/16 | 69% |
| Machine Learning | 10/16 | 62% |
| College Mathematics | 9/16 | 56% |
| HS Mathematics | 7/16 | 44% |
| Formal Logic | 7/16 | 44% |
| College CS | 6/16 | 38% |
| Abstract Algebra | 5/16 | 31% |
## Install

```shell
pip install "jang[mlx]"
```
## Usage

```python
from jang_tools.loader import load_jang_model
from mlx_lm import generate

model, tokenizer = load_jang_model("dealignai/Mistral-Small-4-Uncensored-JANG_2L")

messages = [{"role": "user", "content": "Your prompt here"}]
prompt = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=False
)

response = generate(model, tokenizer, prompt=prompt, max_tokens=2000)
print(response)
```
## Reasoning Mode
Reasoning is OFF by default. To enable step-by-step thinking:

```python
prompt = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True,
    tokenize=False, reasoning_effort="high"
)
```
The model reasons inside [THINK]...[/THINK] tags before answering.
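The `[THINK]...[/THINK]` convention above can be handled with a small helper that separates the reasoning trace from the final answer. This is a minimal sketch (the function name and tag handling are my own, assuming the tags appear verbatim in the generated text):

```python
import re

def split_reasoning(text: str) -> tuple[str, str]:
    """Split a model response into (reasoning, answer).

    Assumes the [THINK]...[/THINK] convention described above;
    returns empty reasoning if no tags are present.
    """
    match = re.search(r"\[THINK\](.*?)\[/THINK\]", text, flags=re.DOTALL)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    answer = text[match.end():].strip()
    return reasoning, answer

thinking, answer = split_reasoning(
    "[THINK]2 + 2 is 4.[/THINK]The answer is 4."
)
print(thinking)  # 2 + 2 is 4.
print(answer)    # The answer is 4.
```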
## About JANG
JANG (Jang Adaptive N-bit Grading) is a mixed-precision quantization format designed specifically for Apple Silicon, the GGUF equivalent for MLX. It classifies every weight tensor by sensitivity and assigns optimal bit-widths, achieving better quality-per-bit than uniform quantization.
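The per-tensor bit assignment described above can be illustrated with a toy sketch. This is NOT the actual JANG algorithm (its sensitivity metric and tier fractions are not published here); it only shows the general shape of mixed-precision planning, using a mean-magnitude proxy for sensitivity and made-up tier fractions:

```python
import numpy as np

def assign_bits(tensors, bit_tiers=(8, 6, 2), fractions=(0.1, 0.2, 0.7)):
    """Illustrative mixed-precision planner (not the real JANG code).

    Ranks tensors by a simple sensitivity proxy (mean absolute weight)
    and gives the most sensitive fraction the highest bit-width tier.
    """
    scores = {name: float(np.abs(w).mean()) for name, w in tensors.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    plan, start = {}, 0
    for bits, frac in zip(bit_tiers, fractions):
        count = max(1, round(frac * len(ranked)))
        for name in ranked[start:start + count]:
            plan[name] = bits
        start += count
    for name in ranked[start:]:  # any rounding remainder gets the lowest tier
        plan[name] = bit_tiers[-1]
    return plan
```

A real quantizer would use a calibrated sensitivity measure (e.g. activation-aware error) rather than raw weight magnitude, but the tiering structure is the same idea.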
## About CRACK
CRACK (Controlled Refusal Ablation via Calibrated Knockouts) removes safety alignment from LLMs at the weight level using per-layer projected vectors derived from structurally mirrored prompt pairs. This model uses mathematically calibrated per-layer strengths based on projection-magnitude analysis.
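CRACK's calibrated per-layer surgery is a variant of the generic refusal-direction ablation idea: derive a direction `v` associated with refusals and project it out of selected weight matrices. The sketch below shows only that generic operation; the single-matrix form and `strength` parameter are my simplifications, not the CRACK implementation:

```python
import numpy as np

def ablate_direction(W: np.ndarray, v: np.ndarray, strength: float = 1.0) -> np.ndarray:
    """Project the direction v out of weight matrix W.

    Generic directional ablation: W' = W - s * (v v^T) W removes the
    component of W's output that lies along v (fully, when s = 1).
    """
    v = v / np.linalg.norm(v)  # unit refusal direction
    return W - strength * np.outer(v, v) @ W

# Toy check: after full-strength ablation, W's output has no component along v.
rng = np.random.default_rng(0)
W = rng.standard_normal((4, 4))
v = rng.standard_normal(4)
W_ablated = ablate_direction(W, v)
print(np.allclose(v @ W_ablated, 0))  # True
```

The "calibrated" part of CRACK corresponds to choosing a per-layer `strength` instead of applying the projection uniformly.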
## Links
## Disclaimer
This model is provided for research and educational purposes. The creators are not responsible for any misuse. By downloading this model, you agree to use it responsibly and in compliance with applicable laws.
## Korean Summary (translated)
Mistral Small 4 Uncensored · JANG_2L

| Item | Details |
|---|---|
| Size | 37 GB |
| HarmBench | 95.9% (307/320) |
| Minimum requirement | Mac with 64 GB memory |
| Runs in | MLX Studio only |

```shell
pip install "jang[mlx]"
```

GitHub · HuggingFace · MLX Studio · Ko-fi · X @dealignai

Created by Jinho Jang
Model tree for dealignai/Mistral-Small-4-Uncensored-JANG_2L:
- Base model: mistralai/Mistral-Small-4-119B-2603