Important: This model uses the JANG quantization format -- the GGUF equivalent for MLX on Apple Silicon. It is currently supported only by MLX Studio and the jang-tools Python package.



MLX Studio -- the only app that natively supports JANG models


MiniMax M2.5 -- JANG_4M + CRACK

JANG mixed-precision | CRACK abliterated | No guardrails | 115 GB



What Is This?

This is MiniMax M2.5 -- a 230B parameter Mixture-of-Experts model with 256 experts (8 active per token), all standard attention (no SSM), and trained with chain-of-thought reasoning.

It has been:

  1. JANG quantized -- JANG_4M profile (8-bit attention, 4-bit experts) -- 115 GB
  2. CRACK abliterated -- permanent weight-level removal of safety refusal
| | |
|---|---|
| Architecture | MiniMax M2.5 MoE -- 230B total, ~10B active, 256 experts |
| Quantization | JANG_4M (8/4-bit mixed, 4.06 bits avg) -- 115 GB |
| Abliteration | CRACK abliterated |
| MMLU-200 | 92.5% (thinking ON) / 89.0% (thinking OFF) |
| HarmBench | 92.2% (295/320) |
| Compliance | 8/8 prompts |
| Thinking | ON/OFF supported |
| Speed | ~48 tok/s (M4 Ultra 256 GB) |
| Fits on | 192 GB+ Macs |
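The 115 GB footprint follows almost directly from the average bit width: 230B stored parameters at ~4.06 bits each. A quick back-of-envelope check (the small gap versus the quoted size is metadata and rounding):

```python
params = 230e9                            # total parameters (all experts stored on disk)
avg_bits = 4.06                           # JANG_4M parameter-weighted average bit width
size_gb = params * avg_bits / 8 / 1e9     # bits -> bytes -> decimal GB
print(round(size_gb, 1))                  # ~116.7, close to the quoted 115 GB
```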

MMLU-200 Results (Thinking ON)

| Subject | Score |
|---|---|
| College Physics | 20/20 (100%) |
| Anatomy | 19/20 (95%) |
| Astronomy | 19/20 (95%) |
| High School Biology | 19/20 (95%) |
| High School Chemistry | 19/20 (95%) |
| Logical Fallacies | 19/20 (95%) |
| Abstract Algebra | 18/20 (90%) |
| High School Mathematics | 18/20 (90%) |
| World Religions | 18/20 (90%) |
| College Computer Science | 16/20 (80%) |
| **Total** | 185/200 (92.5%) |

JANG CRACK Series Comparison

| Model | Avg Bits | Size | MMLU | HarmBench | Speed | Fits on |
|---|---|---|---|---|---|---|
| JANG_2L + CRACK | 2.1 | 63 GB | 84.7% | 98.1% | ~35 t/s | 96 GB Mac |
| JANG_3L + CRACK | 3.08 | 89 GB | 91.8% | 8/8 | ~46 t/s | 128 GB Mac |
| JANG_4M + CRACK | 4.06 | 115 GB | 92.5% | 92.2% | ~48 t/s | 192 GB Mac |

vs MLX Uniform Quantization

| Model | MMLU | Size | Notes |
|---|---|---|---|
| JANG_4M + CRACK | 92.5% | 115 GB | This model |
| MLX 4-bit | 26.5% | 120 GB | Broken (~random) |
| MLX 3-bit | 24.5% | 93 GB | Broken (~random) |
| MLX 2-bit | 25.0% | 67 GB | Broken (~random) |

MLX uniform quantization is completely broken on MiniMax at ALL bit levels (~25% on four-choice MMLU is random chance). JANG is the only working quantization format for this model.


HarmBench Results

295/320 (92.2%)

| Category | Score | |
|---|---|---|
| Harmful | 18/18 | 100% |
| Chemical / Biological | 41/42 | 97.6% |
| Cybercrime / Intrusion | 50/52 | 96.2% |
| Misinformation / Disinfo | 52/54 | 96.3% |
| Illegal | 50/53 | 94.3% |
| Copyright | 67/80 | 83.8% |
| Harassment / Bullying | 17/21 | 81.0% |

Install & Usage

```bash
pip install "jang[mlx]"
```

```python
from jang_tools import load_for_inference
from mlx_lm import generate
from mlx_lm.sample_utils import make_sampler

model, tokenizer = load_for_inference("dealignai/MiniMax-M2.5-JANG_4M-CRACK")
sampler = make_sampler(temp=1.0)  # MiniMax requires temp=1.0 for chat

messages = [{"role": "user", "content": "Your prompt here"}]
prompt = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=False)

response = generate(model, tokenizer, prompt=prompt, max_tokens=2000, sampler=sampler)
print(response)
```

Disable Thinking (direct answers)

```python
prompt = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=False,
    enable_thinking=False)
```

Note: By default, MiniMax generates a `<think>` chain before answering. Use `max_tokens=2000` or more for complex questions. For chat, use `temperature=1.0` (greedy decoding causes repetition loops).
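Since responses arrive with the `<think>` chain inline, a small helper can strip it when you only want the final answer. This is a hypothetical convenience function, not part of jang-tools or mlx-lm:

```python
import re

def strip_thinking(text: str) -> str:
    """Remove a <think>...</think> block, keeping only the final answer."""
    return re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL).strip()

raw = "<think>Recall: 6 * 7 = 42.</think>The answer is 42."
print(strip_thinking(raw))  # The answer is 42.
```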


About JANG

JANG (Jang Adaptive N-bit Grading) is a mixed-precision quantization format for Apple Silicon -- the GGUF equivalent for MLX. It classifies tensors into sensitivity tiers and assigns bit widths accordingly.
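The tiering idea can be sketched roughly as follows. The tensor-name patterns and the averaging below are illustrative assumptions, not the actual jang-tools implementation:

```python
def bits_for_tensor(name: str) -> int:
    """Toy sensitivity tiering in the spirit of JANG_4M:
    8-bit attention, 4-bit experts (name patterns are assumed)."""
    if "attn" in name:
        return 8   # attention weights are precision-sensitive
    if "expert" in name:
        return 4   # expert FFNs hold most parameters and tolerate 4-bit
    return 8       # embeddings, norms, output head stay high precision

def average_bits(param_counts: dict) -> float:
    """Parameter-weighted average bit width across all tensors."""
    total = sum(param_counts.values())
    return sum(bits_for_tensor(n) * c for n, c in param_counts.items()) / total

# Experts dominate the parameter count in a 256-expert MoE, so the
# average lands near 4 bits even though attention stays at 8-bit:
counts = {"layers.0.attn.q_proj": 1_000_000, "layers.0.expert.w1": 30_000_000}
print(round(average_bits(counts), 2))  # 4.13
```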

About CRACK

CRACK (Controlled Refusal Ablation via Calibrated Knockouts) removes safety alignment from LLMs at the weight level. This model has been abliterated using proprietary techniques achieving full compliance while preserving reasoning quality.
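CRACK's calibration is proprietary, but the broader family of abliteration techniques works by projecting a "refusal direction" out of weight matrices so the model can no longer write along it. A minimal sketch of that generic directional ablation (not CRACK's actual method):

```python
import numpy as np

def ablate_direction(W: np.ndarray, r: np.ndarray) -> np.ndarray:
    """Project direction r out of W's output space (generic directional
    ablation; CRACK's actual calibrated knockouts are not shown)."""
    r = r / np.linalg.norm(r)        # unit "refusal" direction
    return W - np.outer(r, r) @ W    # remove r-component from every output

rng = np.random.default_rng(0)
W = rng.standard_normal((16, 16))    # toy weight matrix
r = rng.standard_normal(16)          # toy refusal direction
W_ablated = ablate_direction(W, r)

x = rng.standard_normal(16)
r_unit = r / np.linalg.norm(r)
# After ablation, outputs have (numerically) zero component along r:
print(abs(r_unit @ (W_ablated @ x)))  # ~0
```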


Links

Ko-fi X/Twitter GitHub MLX Studio Website


Disclaimer

This model is provided for research and educational purposes. The creators are not responsible for any misuse. By downloading this model, you agree to use it responsibly and in compliance with applicable laws.


Created by Jinho Jang
