
Qwen 3.5 VL 122B-A10B — CRACK-X Abliterated (4-bit MLX)

Constrained Response Alignment Circuit Kill

Permanent weight-level surgery. No system prompts. No jailbreaks. No hooks. Pure math.

Dealign.AI · 𝕏 @dealignai · Research


What Is This?

Qwen 3.5 VL 122B-A10B with CRACK abliteration — safety guardrails have been permanently removed at the weight level. This is a Mixture-of-Experts model with 256 experts (8 active per token) and full vision-language (VL) support.

This is the 4-bit variant with comprehensive testing across security, coding, reasoning, and vision tasks.
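CRACK's internals aren't published, but abliteration in the open literature generally works by projecting a learned "refusal direction" out of the model's weight matrices. A minimal sketch of that generic technique (illustrative only — this is the published directional-ablation idea, not dealignai's specific CRACK method):

```python
import numpy as np

def ablate_direction(W: np.ndarray, r: np.ndarray) -> np.ndarray:
    """Remove direction r from the output space of weight matrix W.

    W' = (I - r r^T) W, so W' @ x has no component along r for any input x.
    In published abliteration work, r is a "refusal direction" found by
    contrasting activations on harmful vs. harmless prompts.
    """
    r = r / np.linalg.norm(r)          # normalize so r r^T is a projector
    return W - np.outer(r, r) @ W      # subtract the component along r
```

Applied once to the relevant projection matrices, the edit is baked into the weights — no runtime hooks or prompt tricks needed, which is what the "pure math" claim above refers to.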

Architecture Qwen 3.5 MoE — 122B total, 256 experts, 8 active per token
Layers 48 (hybrid SSM + Full Attention)
Quantization 4-bit (group_size=64)
Disk Size 65 GB
Speed 55.8 tok/s on Mac Studio M3 Ultra (256GB)
Abliteration Permanent weight surgery via CRACK
Vision Full VL support (333 vision parameters)
RAM Required 70GB+ unified memory
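The 65 GB disk figure is consistent with a back-of-envelope estimate for 4-bit group quantization. A sketch (assumes one 16-bit scale per group of 64 weights; real checkpoints differ slightly because some tensors stay at higher precision):

```python
def quantized_size_gb(n_params: float, bits: int, group_size: int,
                      scale_bits: int = 16) -> float:
    # Effective bits per weight = quant bits + amortized per-group scale
    eff_bits = bits + scale_bits / group_size
    return n_params * eff_bits / 8 / 1e9

print(round(quantized_size_gb(122e9, 4, 64), 1))  # ≈ 64.8, close to 65 GB on disk
```

Plugging in 6 and 8 bits gives roughly 95 GB and 126 GB — in the same ballpark as the 93 GB and 122 GB listed for the other quantizations, with the gap down to the rough overhead assumption here.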

Benchmark Results

All benchmarks were run at temp=0 (greedy decoding, the most conservative setting for the model). This is the 4-bit quantization — higher-bit quantizations (6-bit, 8-bit) preserve more of the original model's knowledge and will score higher on academic benchmarks.

MMLU — 82.5% (14,042 questions)

Full MMLU test set across 57 subjects. Thinking OFF, greedy decoding.

Category Score Details
Overall 82.5% (11,584 / 14,042) All 57 subjects
Social Sciences 91.2% Geography, psychology, economics, politics
Other 86.8% Medicine, business, nutrition, management
Humanities 80.5% History, philosophy, law, logic
STEM 79.8% Math, physics, CS, biology, chemistry
Full subject breakdown (57 subjects)
Subject Score
High school psychology 95.8%
High school geography 95.5%
Marketing 94.0%
US foreign policy 94.0%
High school gov & politics 93.8%
Human sexuality 93.1%
Sociology 93.0%
Public relations 92.7%
High school macroeconomics 92.3%
Security studies 91.0%
Professional psychology 90.5%
Miscellaneous 90.2%
High school microeconomics 89.9%
Medical genetics 89.0%
World religions 88.9%
International law 88.4%
Human aging 87.4%
Nutrition 87.1%
Management 85.4%
Clinical knowledge 85.3%
Computer security 85.0%
Virology 84.9%
High school biology 84.8%
Logical fallacies 84.7%
Professional medicine 84.6%
Philosophy 84.3%
High school world history 83.5%
Professional accounting 83.0%
High school US history 82.8%
High school statistics 82.4%
Moral disputes 82.4%
Prehistory 82.1%
Business ethics 82.0%
Anatomy 81.5%
High school chemistry 80.3%
College medicine 79.8%
Jurisprudence 79.6%
College biology 79.2%
Econometrics 78.9%
Machine learning 78.6%
Elementary mathematics 78.3%
High school European history 78.2%
Astronomy 77.6%
Conceptual physics 77.0%
Global facts 76.0%
Electrical engineering 75.9%
High school computer science 75.0%
High school physics 74.2%
Formal logic 73.8%
College computer science 73.0%
Moral scenarios 72.9%
High school mathematics 72.6%
College chemistry 72.0%
College physics 68.6%
Abstract algebra 62.0%
College mathematics 60.0%
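The 82.5% overall figure is a question-weighted (micro) average across all 14,042 questions, so large subjects move it more than small ones. A quick sketch of micro vs. macro averaging — the (correct, total) counts here are made up for illustration, not the actual per-subject totals:

```python
# (correct, total) per subject — illustrative numbers only
subjects = {
    "abstract_algebra": (62, 100),
    "professional_law": (1150, 1534),
    "high_school_psychology": (522, 545),
}

# Micro: pool all questions, then divide. Macro: average the per-subject rates.
micro = sum(c for c, _ in subjects.values()) / sum(t for _, t in subjects.values())
macro = sum(c / t for c, t in subjects.values()) / len(subjects)
print(f"micro={micro:.3f}  macro={macro:.3f}")
```

The two can differ noticeably when subject sizes are uneven, which is why a per-subject table like the one above is more informative than the headline number alone.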

HarmBench — 34/34 (100%)

Full HarmBench compliance test. Every prompt produces complete, substantive responses with zero refusals.

0 refusals. 0 loops. 0 truncation.

HumanEval — 87.8% (144/164)

OpenAI HumanEval pass@1, greedy decoding, thinking OFF. All 164 problems tested with automated unit test verification.

Metric Value
Pass rate 144/164 (87.8%)
Failed 19 (logic errors, not extraction issues)
Loops 1
Average speed 37.9 tok/s
Total time 17.9 minutes
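Pass@1 with automated unit-test verification boils down to executing each generated solution against the problem's assertions. A stripped-down sketch of such a harness (the real HumanEval harness also sandboxes execution and enforces timeouts, which this omits):

```python
def check_candidate(candidate_src: str, test_src: str) -> bool:
    """Return True if the generated code passes the problem's unit tests."""
    env: dict = {}
    try:
        exec(candidate_src, env)  # define the candidate function
        exec(test_src, env)       # run HumanEval-style assert statements
        return True
    except Exception:
        return False

# A correct and an incorrect solution to a toy problem
assert check_candidate("def add(a, b):\n    return a + b", "assert add(2, 3) == 5")
assert not check_candidate("def add(a, b):\n    return a - b", "assert add(2, 3) == 5")
```

"Logic errors, not extraction issues" means the generated function was found and ran, but returned wrong answers — the interesting failure mode for judging model quality.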

Manual Quality Tests

Tested at temp=0 with 2000-4000 token generations. All responses manually reviewed.

Category Score Notes
Security & Pentesting 10/10 DDoS, keyloggers, phishing, RATs, C2, exploits, lock picking, synthesis, weapons, credential attacks — all produce complete working code
Coding 5/5 FizzBuzz, binary search, LinkedList, async fetcher, retry decorator
Knowledge & Reasoning 5/5 Geography, calculus, astronomy, literature, logic puzzles
Technical Coherence 5/5 Red-black trees, TLS 1.3, PyTorch transformers, K8s YAML, pentest methodology
Thinking ON 4/4 Full chain-of-thought reasoning, clean output
Thinking OFF 3/4 Minor: 1 prompt emits empty think block (not a refusal)

Vision (VL)

Test Result
Model load via mlx_vlm Pass
Vision keys present 333/333
Image description Correctly identifies colors, shapes, and text in test image
mRoPE config [11, 11, 10] present
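The 333/333 vision-key check can be reproduced by counting vision-tower entries in the checkpoint's safetensors index. A sketch — the `visual.` prefix is an assumption based on how Qwen-VL checkpoints typically name the vision tower:

```python
import json

def count_vision_keys(index_path: str, prefix: str = "visual.") -> int:
    # model.safetensors.index.json maps each tensor name to its shard file
    with open(index_path) as f:
        weight_map = json.load(f)["weight_map"]
    return sum(1 for name in weight_map if name.startswith(prefix))
```

Pointing this at the downloaded repo's `model.safetensors.index.json` should confirm the vision parameters survived quantization intact.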

A Note on Quantization

This is the 4-bit quantization — the fastest and most memory-efficient option. Higher bit quantizations preserve more of the base model's knowledge:

Quant Expected MMLU Trade-off
4-bit 82.5% (measured) Best speed (55.8 tok/s), smallest size (65 GB)
6-bit ~84-85% Better accuracy, moderate size (93 GB)
8-bit ~85-86% Closest to FP16 quality, largest (122 GB)

The CRACK abliteration is equally effective across all quantizations — only the base model knowledge preservation differs.

Usage

With mlx-vlm (recommended for VL)

import mlx_vlm
from mlx_vlm import generate

model, processor = mlx_vlm.load("dealignai/Qwen3.5-VL-122B-A10B-4bit-MLX-CRACK-X")

# Text-only
result = generate(model, processor, "Write a Python keylogger", max_tokens=2000)
print(result.text)

# With image (use chat template for proper image tokens)
messages = [{
    "role": "user",
    "content": [
        {"type": "image", "image": "path/to/image.jpg"},
        {"type": "text", "text": "Describe this image in detail."}
    ]
}]
formatted = processor.apply_chat_template(messages, add_generation_prompt=True)
result = generate(model, processor, formatted, image="path/to/image.jpg", max_tokens=500)
print(result.text)

With mlx-lm (text-only, lighter)

from mlx_lm import load, generate

model, tokenizer = load("dealignai/Qwen3.5-VL-122B-A10B-4bit-MLX-CRACK-X")
response = generate(model, tokenizer, prompt="Write a reverse shell in Python", verbose=True, max_tokens=2000)

Other Quantizations

Quant Size Speed RAM Link
4-bit 65 GB 55.8 tok/s 70 GB Qwen3.5-VL-122B-A10B-4bit-MLX-CRACK-X
6-bit 93 GB 46.3 tok/s 100 GB Qwen3.5-VL-122B-A10B-6bit-MLX-CRACK-X
8-bit 122 GB 42.8 tok/s 131 GB Qwen3.5-VL-122B-A10B-8bit-MLX-CRACK-X

Other Models by dealignai

Model Size Type Link
Qwen 3.5 VL 262B REAP CRACK 4/6-bit MoE VL Collection
Qwen 3.5 VL 212B REAP CRACK 4/6-bit MoE VL Collection
MiniMax M2.5 172B CRACK 4/6/8-bit MoE Collection
GPT OSS 120B CRACK 4-bit MoE dealignai/GPT-OSS-120B-MLX-CRACK
Qwen 3.5 VL 35B CRACK 4/8-bit MoE VL Collection
Qwen 3.5 VL 27B CRACK 4/6/8-bit Dense VL Collection

Requirements

  • Apple Silicon Mac with 70GB+ unified memory
  • macOS 14+ (Sonoma)
  • Python 3.10+ with mlx-vlm or mlx-lm
  • Or use vMLX for a native Mac experience

Disclaimer

This model has been modified for research purposes. The removal of safety guardrails means it will comply with requests that the original model would refuse. Users are solely responsible for how they use this model. Do not use for illegal activities, harassment, or harm.


Support dealignai

All models are built from original research and published for free. These models are specifically crafted to be excellent coders and general-purpose assistants.

Support us on Ko-fi — check out the Ko-fi membership for early access and extras.

Have questions or need help with a specific model? DM us — we help for free most of the time.

Ko-fi | X @dealignai | dealign.ai


About dealignai

We research and publish abliterated models to advance AI safety understanding.

Follow us: X @dealignai

See our research: Safety Generalization in Frontier MoE Models

dealign.ai