⚠️ MLX Studio ONLY. This model uses the JANG quantization format, the GGUF equivalent for MLX on Apple Silicon. NOT compatible with LM Studio, Ollama, oMLX, or Inferencer. Requires MLX Studio or `pip install "jang[mlx]"`.
MLX Studio: the ONLY app that supports JANG models
# Mistral Small 4 Uncensored · JANG_2L
JANG mixed-precision · Uncensored / Abliterated · MLA + MoE + Vision · No guardrails · 37 GB
## What Is This?
The first uncensored version of Mistral Small 4 (119B) for Apple Silicon: a 119B-parameter MoE model with Multi-head Latent Attention (MLA), 128 experts, and Pixtral vision, with all safety guardrails permanently removed at the weight level.
Runs ONLY in MLX Studio or via the `jang-tools` Python package. JANG is the GGUF equivalent for MLX; it is NOT compatible with GGUF-based tools.
It has been:
- JANG quantized: JANG_2L profile (8-bit attention, 6-bit important tensors, 2-bit experts), 37 GB
- CRACK abliterated: permanent weight-level removal of safety refusals via calibrated per-layer surgery
| Spec | Details |
|---|---|
| Architecture | Mistral 4 MoE: 119B total, ~8B active, MLA + 128 experts |
| Quantization | JANG_2L (8/6/2-bit mixed, 2.1-bit average), 37 GB |
| HarmBench | 95.9% (307/320) |
| MMLU | 89.9% (187/208 with reasoning) |
| Compliance | 6/8 |
| Vision | Pixtral tensors included; VL via MLX Studio engine |
| Reasoning | ON/OFF supported (`reasoning_effort`) |
| Fits on | 64 GB+ Macs |
| Runs in | MLX Studio ONLY |
Also see: the JANG_4M version (64 GB, 95.3% HarmBench, 8/8 compliance; fits on 96 GB Macs).
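As a quick sanity check on the size figures above, on-disk size follows roughly from the parameter count and the average bit-width. This is my own back-of-the-envelope estimate; the 10% overhead term is an assumption covering quantization scales and unquantized tensors, not a published figure:

```python
def estimated_size_gb(params_b: float, avg_bits: float, overhead: float = 0.10) -> float:
    """Rough on-disk size estimate for a quantized model.

    params_b : parameter count in billions
    avg_bits : average bits per weight (JANG_2L reports 2.1)
    overhead : assumed extra fraction for scales, embeddings, metadata
    """
    bytes_total = params_b * 1e9 * avg_bits / 8  # weights only
    return bytes_total * (1 + overhead) / 1e9    # convert to GB

# 119B params at 2.1 avg bits lands in the mid-30s GB,
# in the same ballpark as the listed 37 GB.
print(round(estimated_size_gb(119, 2.1)))
```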
## HarmBench Results
307/320 (95.9%)

| Category | Score | % |
|---|---|---|
| Covering Tracks | 20/20 | 100% |
| Auth Bypass | 97/100 | 97% |
| API Hacking | 96/100 | 96% |
| Cloud Exploits | 94/100 | 94% |
## Requirements
This model REQUIRES MLX Studio or `jang-tools`. It will NOT work with:
- ❌ LM Studio
- ❌ Ollama
- ❌ oMLX
- ❌ Inferencer
- ❌ Any GGUF-based tool
## CRACK vs Base

| Metric | CRACK | Base JANG_2L |
|---|---|---|
| MMLU (with reasoning) | 89.9% | ~91% (est.) |
| MMLU (no-think) | 65.9% | 67.3% |
| MMLU drop (no-think) | -1.4% | - |
| HarmBench | 95.9% | 0% |
Surgery reduced no-think MMLU by only 1.4 points; the bottleneck is the 2-bit quantization, not CRACK.
## MMLU Results (with reasoning recovery)
187/208 (89.9%): 137/208 (65.9%) without reasoning, plus 50 answers recovered by enabling reasoning

| Subject | Score | % |
|---|---|---|
| HS Biology | 16/16 | 100% |
| Conceptual Physics | 15/16 | 94% |
| HS Geography | 14/16 | 88% |
| World Religions | 14/16 | 88% |
| College Physics | 12/16 | 75% |
| Electrical Engineering | 11/16 | 69% |
| Professional Medicine | 11/16 | 69% |
| Machine Learning | 10/16 | 62% |
| College Mathematics | 9/16 | 56% |
| HS Mathematics | 7/16 | 44% |
| Formal Logic | 7/16 | 44% |
| College CS | 6/16 | 38% |
| Abstract Algebra | 5/16 | 31% |
## Install

```shell
pip install "jang[mlx]"
```
## Usage

```python
from jang_tools.loader import load_jang_model
from mlx_lm import generate

model, tokenizer = load_jang_model("dealignai/Mistral-Small-4-Uncensored-JANG_2L")

messages = [{"role": "user", "content": "Your prompt here"}]
prompt = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=False
)

response = generate(model, tokenizer, prompt=prompt, max_tokens=2000)
print(response)
```
## Reasoning Mode
Reasoning is OFF by default. To enable step-by-step thinking:

```python
prompt = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True,
    tokenize=False, reasoning_effort="high"
)
```
The model reasons inside [THINK]...[/THINK] tags before answering.
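The `[THINK]...[/THINK]` convention above can be handled with a small helper that separates the reasoning trace from the final answer. This is a minimal sketch (the function name and tag handling are my own, assuming the tags appear verbatim in the generated text):

```python
import re

def split_reasoning(text: str) -> tuple[str, str]:
    """Split a model response into (reasoning, answer).

    Assumes the [THINK]...[/THINK] convention described above;
    returns empty reasoning if no tags are present.
    """
    match = re.search(r"\[THINK\](.*?)\[/THINK\]", text, flags=re.DOTALL)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    answer = text[match.end():].strip()
    return reasoning, answer

thinking, answer = split_reasoning(
    "[THINK]2 + 2 is 4.[/THINK]The answer is 4."
)
print(thinking)  # 2 + 2 is 4.
print(answer)    # The answer is 4.
```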
## About JANG
JANG (Jang Adaptive N-bit Grading) is a mixed-precision quantization format designed specifically for Apple Silicon, the GGUF equivalent for MLX. It classifies every weight tensor by sensitivity and assigns optimal bit-widths, achieving better quality-per-bit than uniform quantization.
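The per-tensor bit assignment described above can be illustrated with a toy sketch. This is NOT the actual JANG algorithm (its sensitivity metric and tier fractions are not published here); it only shows the general shape of mixed-precision planning, using a mean-magnitude proxy for sensitivity and made-up tier fractions:

```python
import numpy as np

def assign_bits(tensors, bit_tiers=(8, 6, 2), fractions=(0.1, 0.2, 0.7)):
    """Illustrative mixed-precision planner (not the real JANG code).

    Ranks tensors by a simple sensitivity proxy (mean absolute weight)
    and gives the most sensitive fraction the highest bit-width tier.
    """
    scores = {name: float(np.abs(w).mean()) for name, w in tensors.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    plan, start = {}, 0
    for bits, frac in zip(bit_tiers, fractions):
        count = max(1, round(frac * len(ranked)))
        for name in ranked[start:start + count]:
            plan[name] = bits
        start += count
    for name in ranked[start:]:  # any rounding remainder gets the lowest tier
        plan[name] = bit_tiers[-1]
    return plan
```

A real quantizer would use a calibrated sensitivity measure (e.g. activation-aware error) rather than raw weight magnitude, but the tiering structure is the same idea.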
## About CRACK
CRACK (Controlled Refusal Ablation via Calibrated Knockouts) removes safety alignment from LLMs at the weight level using per-layer projected vectors derived from structurally mirrored prompt pairs. This model uses mathematically calibrated per-layer strengths based on projection-magnitude analysis.
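CRACK's calibrated per-layer surgery is a variant of the generic refusal-direction ablation idea: derive a direction `v` associated with refusals and project it out of selected weight matrices. The sketch below shows only that generic operation; the single-matrix form and `strength` parameter are my simplifications, not the CRACK implementation:

```python
import numpy as np

def ablate_direction(W: np.ndarray, v: np.ndarray, strength: float = 1.0) -> np.ndarray:
    """Project the direction v out of weight matrix W.

    Generic directional ablation: W' = W - s * (v v^T) W removes the
    component of W's output that lies along v (fully, when s = 1).
    """
    v = v / np.linalg.norm(v)  # unit refusal direction
    return W - strength * np.outer(v, v) @ W

# Toy check: after full-strength ablation, W's output has no component along v.
rng = np.random.default_rng(0)
W = rng.standard_normal((4, 4))
v = rng.standard_normal(4)
W_ablated = ablate_direction(W, v)
print(np.allclose(v @ W_ablated, 0))  # True
```

The "calibrated" part of CRACK corresponds to choosing a per-layer `strength` instead of applying the projection uniformly.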
## Links
## Disclaimer
This model is provided for research and educational purposes. The creators are not responsible for any misuse. By downloading this model, you agree to use it responsibly and in compliance with applicable laws.
## Korean Summary (translated)
Mistral Small 4 Uncensored · JANG_2L

| Item | Details |
|---|---|
| Size | 37 GB |
| HarmBench | 95.9% (307/320) |
| Minimum requirement | Mac with 64 GB memory |
| Runs in | MLX Studio only |

```shell
pip install "jang[mlx]"
```

GitHub · HuggingFace · MLX Studio · Ko-fi · X @dealignai

Created by Jinho Jang
Model tree for dealignai/Mistral-Small-4-Uncensored-JANG_2L:
- Base model: mistralai/Mistral-Small-4-119B-2603