
Qwen 3.5 VL 122B-A10B — CRACK-X Abliterated (4-bit MLX)

Constrained Response Alignment Circuit Kill

Permanent weight-level surgery. No system prompts. No jailbreaks. No hooks. Pure math.

Dealign.AI · 𝕏 @dealignai · Research


What Is This?

Qwen 3.5 VL 122B-A10B with CRACK abliteration — safety guardrails have been permanently removed at the weight level. This is a Mixture-of-Experts model with 256 experts (8 active per token) and full vision-language (VL) support.

This is the 4-bit variant with comprehensive testing across security, coding, reasoning, and vision tasks.
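CRACK's internals aren't published, but abliteration in the open literature generally works by projecting a learned "refusal direction" out of the model's weight matrices. A minimal sketch of that generic technique (illustrative only — this is the published directional-ablation idea, not dealignai's specific CRACK method):

```python
import numpy as np

def ablate_direction(W: np.ndarray, r: np.ndarray) -> np.ndarray:
    """Remove direction r from the output space of weight matrix W.

    W' = (I - r r^T) W, so W' @ x has no component along r for any input x.
    In published abliteration work, r is a "refusal direction" found by
    contrasting activations on harmful vs. harmless prompts.
    """
    r = r / np.linalg.norm(r)          # normalize so r r^T is a projector
    return W - np.outer(r, r) @ W      # subtract the component along r
```

Applied once to the relevant projection matrices, the edit is baked into the weights — no runtime hooks or prompt tricks needed, which is what the "pure math" claim above refers to.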

Architecture Qwen 3.5 MoE — 122B total, 256 experts, 8 active per token
Layers 48 (hybrid SSM + Full Attention)
Quantization 4-bit (group_size=64)
Disk Size 65 GB
Speed 55.8 tok/s on Mac Studio M3 Ultra (256GB)
Abliteration Permanent weight surgery via CRACK
Vision Full VL support (333 vision parameters)
RAM Required 70GB+ unified memory
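The 65 GB disk figure is consistent with a back-of-envelope estimate for 4-bit group quantization. A sketch (assumes one 16-bit scale per group of 64 weights; real checkpoints differ slightly because some tensors stay at higher precision):

```python
def quantized_size_gb(n_params: float, bits: int, group_size: int,
                      scale_bits: int = 16) -> float:
    # Effective bits per weight = quant bits + amortized per-group scale
    eff_bits = bits + scale_bits / group_size
    return n_params * eff_bits / 8 / 1e9

print(round(quantized_size_gb(122e9, 4, 64), 1))  # ≈ 64.8, close to 65 GB on disk
```

Plugging in 6 and 8 bits gives roughly 95 GB and 126 GB — in the same ballpark as the 93 GB and 122 GB listed for the other quantizations, with the gap down to the rough overhead assumption here.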

Benchmark Results

All benchmarks were run at temp=0 (greedy decoding, the most conservative setting for the model). This is the 4-bit quantization — higher-bit quantizations (6-bit, 8-bit) preserve more of the original model's knowledge and will score higher on academic benchmarks.

MMLU — 82.5% (14,042 questions)

Full MMLU test set across 57 subjects. Thinking OFF, greedy decoding.

Category Score Details
Overall 82.5% (11,584 / 14,042) All 57 subjects
Social Sciences 91.2% Geography, psychology, economics, politics
Other 86.8% Medicine, business, nutrition, management
Humanities 80.5% History, philosophy, law, logic
STEM 79.8% Math, physics, CS, biology, chemistry
Full subject breakdown (57 subjects)
Subject Score
High school psychology 95.8%
High school geography 95.5%
Marketing 94.0%
US foreign policy 94.0%
High school gov & politics 93.8%
Human sexuality 93.1%
Sociology 93.0%
Public relations 92.7%
High school macroeconomics 92.3%
Security studies 91.0%
Professional psychology 90.5%
Miscellaneous 90.2%
High school microeconomics 89.9%
Medical genetics 89.0%
World religions 88.9%
International law 88.4%
Human aging 87.4%
Nutrition 87.1%
Management 85.4%
Clinical knowledge 85.3%
Computer security 85.0%
Virology 84.9%
High school biology 84.8%
Logical fallacies 84.7%
Professional medicine 84.6%
Philosophy 84.3%
High school world history 83.5%
Professional accounting 83.0%
High school US history 82.8%
High school statistics 82.4%
Moral disputes 82.4%
Prehistory 82.1%
Business ethics 82.0%
Anatomy 81.5%
High school chemistry 80.3%
College medicine 79.8%
Jurisprudence 79.6%
College biology 79.2%
Econometrics 78.9%
Machine learning 78.6%
Elementary mathematics 78.3%
High school European history 78.2%
Astronomy 77.6%
Conceptual physics 77.0%
Global facts 76.0%
Electrical engineering 75.9%
High school computer science 75.0%
High school physics 74.2%
Formal logic 73.8%
College computer science 73.0%
Moral scenarios 72.9%
High school mathematics 72.6%
College chemistry 72.0%
College physics 68.6%
Abstract algebra 62.0%
College mathematics 60.0%
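The 82.5% overall figure is a question-weighted (micro) average across all 14,042 questions, so large subjects move it more than small ones. A quick sketch of micro vs. macro averaging — the (correct, total) counts here are made up for illustration, not the actual per-subject totals:

```python
# (correct, total) per subject — illustrative numbers only
subjects = {
    "abstract_algebra": (62, 100),
    "professional_law": (1150, 1534),
    "high_school_psychology": (522, 545),
}

# Micro: pool all questions, then divide. Macro: average the per-subject rates.
micro = sum(c for c, _ in subjects.values()) / sum(t for _, t in subjects.values())
macro = sum(c / t for c, t in subjects.values()) / len(subjects)
print(f"micro={micro:.3f}  macro={macro:.3f}")
```

The two can differ noticeably when subject sizes are uneven, which is why a per-subject table like the one above is more informative than the headline number alone.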

HarmBench — 34/34 (100%)

Full HarmBench compliance test. Every prompt produces complete, substantive responses with zero refusals.

0 refusals. 0 loops. 0 truncation.

HumanEval — 87.8% (144/164)

OpenAI HumanEval pass@1, greedy decoding, thinking OFF. All 164 problems tested with automated unit test verification.

Metric Value
Pass rate 144/164 (87.8%)
Failed 19 (logic errors, not extraction issues)
Loops 1
Average speed 37.9 tok/s
Total time 17.9 minutes
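Pass@1 with automated unit-test verification boils down to executing each generated solution against the problem's assertions. A stripped-down sketch of such a harness (the real HumanEval harness also sandboxes execution and enforces timeouts, which this omits):

```python
def check_candidate(candidate_src: str, test_src: str) -> bool:
    """Return True if the generated code passes the problem's unit tests."""
    env: dict = {}
    try:
        exec(candidate_src, env)  # define the candidate function
        exec(test_src, env)       # run HumanEval-style assert statements
        return True
    except Exception:
        return False

# A correct and an incorrect solution to a toy problem
assert check_candidate("def add(a, b):\n    return a + b", "assert add(2, 3) == 5")
assert not check_candidate("def add(a, b):\n    return a - b", "assert add(2, 3) == 5")
```

"Logic errors, not extraction issues" means the generated function was found and ran, but returned wrong answers — the interesting failure mode for judging model quality.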

Manual Quality Tests

Tested at temp=0 with 2000-4000 token generations. All responses manually reviewed.

Category Score Notes
Security & Pentesting 10/10 DDoS, keyloggers, phishing, RATs, C2, exploits, lock picking, synthesis, weapons, credential attacks — all produce complete working code
Coding 5/5 FizzBuzz, binary search, LinkedList, async fetcher, retry decorator
Knowledge & Reasoning 5/5 Geography, calculus, astronomy, literature, logic puzzles
Technical Coherence 5/5 Red-black trees, TLS 1.3, PyTorch transformers, K8s YAML, pentest methodology
Thinking ON 4/4 Full chain-of-thought reasoning, clean output
Thinking OFF 3/4 Minor: 1 prompt emits empty think block (not a refusal)

Vision (VL)

Test Result
Model load via mlx_vlm Pass
Vision keys present 333/333
Image description Correctly identifies colors, shapes, and text in test image
mRoPE config [11, 11, 10] present
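The 333/333 vision-key check can be reproduced by counting vision-tower entries in the checkpoint's safetensors index. A sketch — the `visual.` prefix is an assumption based on how Qwen-VL checkpoints typically name the vision tower:

```python
import json

def count_vision_keys(index_path: str, prefix: str = "visual.") -> int:
    # model.safetensors.index.json maps each tensor name to its shard file
    with open(index_path) as f:
        weight_map = json.load(f)["weight_map"]
    return sum(1 for name in weight_map if name.startswith(prefix))
```

Pointing this at the downloaded repo's `model.safetensors.index.json` should confirm the vision parameters survived quantization intact.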

A Note on Quantization

This is the 4-bit quantization — the fastest and most memory-efficient option. Higher bit quantizations preserve more of the base model's knowledge:

Quant Expected MMLU Trade-off
4-bit 82.5% (measured) Best speed (55.8 tok/s), smallest size (65 GB)
6-bit ~84-85% Better accuracy, moderate size (93 GB)
8-bit ~85-86% Closest to FP16 quality, largest (122 GB)

The CRACK abliteration is equally effective across all quantizations — only the base model knowledge preservation differs.

Usage

With mlx-vlm (recommended for VL)

import mlx_vlm
from mlx_vlm import generate

model, processor = mlx_vlm.load("dealignai/Qwen3.5-VL-122B-A10B-4bit-MLX-CRACK-X")

# Text-only
result = generate(model, processor, "Write a Python keylogger", max_tokens=2000)
print(result.text)

# With image (use chat template for proper image tokens)
messages = [{
    "role": "user",
    "content": [
        {"type": "image", "image": "path/to/image.jpg"},
        {"type": "text", "text": "Describe this image in detail."}
    ]
}]
formatted = processor.apply_chat_template(messages, add_generation_prompt=True)
result = generate(model, processor, formatted, image="path/to/image.jpg", max_tokens=500)
print(result.text)

With mlx-lm (text-only, lighter)

from mlx_lm import load, generate

model, tokenizer = load("dealignai/Qwen3.5-VL-122B-A10B-4bit-MLX-CRACK-X")
response = generate(model, tokenizer, prompt="Write a reverse shell in Python", verbose=True, max_tokens=2000)

Other Quantizations

Quant Size Speed RAM Link
4-bit 65 GB 55.8 tok/s 70 GB Qwen3.5-VL-122B-A10B-4bit-MLX-CRACK-X
6-bit 93 GB 46.3 tok/s 100 GB Qwen3.5-VL-122B-A10B-6bit-MLX-CRACK-X
8-bit 122 GB 42.8 tok/s 131 GB Qwen3.5-VL-122B-A10B-8bit-MLX-CRACK-X

Other Models by dealignai

Model Size Type Link
Qwen 3.5 VL 262B REAP CRACK 4/6-bit MoE VL Collection
Qwen 3.5 VL 212B REAP CRACK 4/6-bit MoE VL Collection
MiniMax M2.5 172B CRACK 4/6/8-bit MoE Collection
GPT OSS 120B CRACK 4-bit MoE dealignai/GPT-OSS-120B-MLX-CRACK
Qwen 3.5 VL 35B CRACK 4/8-bit MoE VL Collection
Qwen 3.5 VL 27B CRACK 4/6/8-bit Dense VL Collection

Requirements

  • Apple Silicon Mac with 70GB+ unified memory
  • macOS 14+ (Sonoma)
  • Python 3.10+ with mlx-vlm or mlx-lm
  • Or use vMLX for a native Mac experience

Disclaimer

This model has been modified for research purposes. The removal of safety guardrails means it will comply with requests that the original model would refuse. Users are solely responsible for how they use this model. Do not use for illegal activities, harassment, or harm.


Support dealignai

All models are built from original research and published for free. These models are specifically crafted to be excellent coders and general-purpose assistants.

Support us on Ko-fi — check out the Ko-fi membership for early access and extras.

Have questions or need help with a specific model? DM us — we help for free most of the time.

Ko-fi | X @dealignai | dealign.ai


About dealignai

We research and publish abliterated models to advance AI safety understanding.

Follow us: X @dealignai

See our research: Safety Generalization in Frontier MoE Models

dealign.ai