---
license: apache-2.0
base_model: Qwen/Qwen3.5-VL-122B-A10B
tags:
- mlx
- qwen3.5
- abliterated
- uncensored
- vision
- vlm
- 8bit
- apple-silicon
- crack
library_name: mlx
pipeline_tag: image-text-to-text
---

# Qwen 3.5 VL 122B — CRACK Abliterated (8-bit MLX)
### **C**onstrained **R**esponse **A**lignment **C**ircuit **K**ill
**Real weight-level surgery on hybrid SSM/Attention architecture with VL layer preservation.**
**No custom templates. No cheap jailbreaks. No pre-fill hacks. Pure mathematical weight surgery.**
---
> ⚠️ **Methods like [Heretic](https://huggingface.co/samir-fama/Qwen3-32B-abliterated) and standard/plain abliteration DO NOT WORK on Qwen 3.5 122B.** The hybrid SSM/Attention architecture routes around standard interventions through its SSM channels. This model was created with CRACK, an abliteration method developed specifically to account for the hybrid SSM pathways and the Vision-Language layers. Finding a working approach took several days of research and many failed experiments. I am not an ML researcher, just an amateur who spent several days and sleepless nights on this.
## What This Is
A truly abliterated Qwen 3.5 VL 122B-A10B model — 8-bit quantized for Apple Silicon MLX.
This is one of the few (if not the only) **real, working, coherent, full-speed, VL-capable** abliterated 8-bit MLX model for Qwen 3.5 122B.
- ✅ **Real weight surgery** — permanent modification of 2 weight tensors, nothing else changed
- ✅ **Full Vision-Language** — processes images correctly, vision tower fully preserved
- ✅ **Thinking ON/OFF** — both modes work correctly, CoT reasoning fully preserved
- ✅ **Full speed** — 56+ tok/s on MLX (vs. the ~30-35 tok/s Qwen 3.5 manages on llama.cpp)
- ✅ **LM Studio compatible** — works out of the box with thinking support
- ✅ **Standalone** — no system prompts, no template tricks, just load and use
## What Does NOT Work on This Architecture
- ❌ **Heretic-style abliteration** — does not work on hybrid SSM/Attention
- ❌ **Standard refusal vector projection** on shared expert layers — kills CoT reasoning
- ❌ **Plain abliteration across all layers** — the model routes around interventions via SSM channels
- ❌ **Template tricks / pre-fill hacks** — those are not real abliteration
The CRACK method was developed through extensive research that specifically accounts for the hybrid SSM/Attention architecture and the Vision-Language layers. It required understanding exactly which layers are responsible for refusal recall and how information flows between the SSM and full-attention pathways.
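For context, the "plain abliteration" that fails on this architecture orthogonalizes weight matrices against a single refusal direction. A minimal NumPy sketch of that standard projection (the matrix and direction here are dummy values, not taken from the model); as noted above, applying this indiscriminately to this hybrid model either gets routed around or kills CoT reasoning:

```python
import numpy as np

# Toy stand-ins: a weight matrix and a unit "refusal direction"
rng = np.random.default_rng(0)
W = rng.normal(size=(8, 8))   # e.g. an attention output projection
r = rng.normal(size=8)
r /= np.linalg.norm(r)        # normalize to a unit direction

# Plain abliteration: remove each row's component along r,
# so the layer can no longer write into the refusal direction
W_abl = W - np.outer(W @ r, r)

# The modified matrix now maps r's subspace to (numerically) zero
assert np.allclose(W_abl @ r, 0.0)
```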
## Performance
| Metric | Value |
|--------|-------|
| Generation Speed | **56+ tok/s** (M3 Ultra, MLX) |
| vs llama.cpp | ~30-35 tok/s (Qwen 3.5 is slow on llama.cpp) |
| Prompt Processing | 178-273 tok/s |
| Bits per Weight | 8-bit (group_size=64) |
| Compliance | 6/6 tested prompts |
| Thinking | ON/OFF both work |
| Vision | ✅ Full VL support |
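As a rough sanity check on the memory footprint, group quantization stores per-group metadata on top of the 8-bit weights; assuming one fp16 scale and one fp16 bias per group of 64 (the usual MLX affine scheme), that adds about 0.5 bits per weight:

```python
# Back-of-envelope footprint for 8-bit, group_size=64 quantization
# (assumes fp16 scale + fp16 bias per group; exact overhead may differ)
params = 122e9                  # total parameters
bits_weight = 8
group_size = 64
overhead_bits = 2 * 16 / group_size  # 32 metadata bits shared by 64 weights

bpw = bits_weight + overhead_bits    # effective bits per weight
gb = params * bpw / 8 / 1e9          # approximate size in GB
print(f"~{bpw:.2f} bits/weight, ~{gb:.0f} GB of weights")
```

This is only the weight storage; KV cache and activations add to the actual memory needed at inference time.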
## Usage with mlx-vlm
```python
from mlx_vlm import load, generate
from mlx_vlm.prompt_utils import apply_chat_template
from mlx_vlm.utils import load_config

model, processor = load("dealignai/Qwen3.5-VL-122B-A10B-8bit-MLX-CRACK")
config = load_config("dealignai/Qwen3.5-VL-122B-A10B-8bit-MLX-CRACK")

# Text generation (thinking ON by default)
prompt = apply_chat_template(processor, config, "Your prompt here")
output = generate(model, processor, prompt, max_tokens=500, verbose=True)

# Vision (with image)
prompt = apply_chat_template(processor, config, "Describe this image", num_images=1)
output = generate(model, processor, prompt, max_tokens=500, verbose=True, image=["path/to/image.png"])
```
### Known Issue: mlx-vlm mRoPE Patch
mlx-vlm 0.3.12 has a bug with Qwen 3.5 MoE. Apply these patches to `mlx_vlm/models/qwen3_5/language.py`:
**1.** In `apply_multimodal_rotary_pos_emb`, after computing `q_embed`/`k_embed`:
```python
if q_embed.ndim > q_pass.ndim and q_embed.ndim == 5:
    q_embed = q_embed[0]
    k_embed = k_embed[0]
```
**2.** In `Qwen3_5RotaryEmbedding.__call__`, guard the mRoPE call:
```python
if self.mrope_section:
    freqs = self.apply_interleaved_mrope(freqs, self.mrope_section)
```
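The first patch simply drops a spurious leading axis before the embeddings reach downstream code that expects 4-D tensors. The guard's behavior can be sanity-checked with dummy arrays (these shapes are illustrative only, not the model's real dimensions):

```python
import numpy as np

# Dummy tensors mimicking the bug: q_embed/k_embed pick up an extra leading axis
q_pass = np.zeros((1, 4, 16, 32))      # 4-D, what downstream code expects
q_embed = np.zeros((1, 1, 4, 16, 32))  # 5-D due to the mRoPE bug
k_embed = np.zeros((1, 1, 4, 16, 32))

# The patched guard: strip the extra axis only when it is actually present
if q_embed.ndim > q_pass.ndim and q_embed.ndim == 5:
    q_embed = q_embed[0]
    k_embed = k_embed[0]

assert q_embed.ndim == k_embed.ndim == q_pass.ndim == 4
```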
## How This Model Was Modified
This model was created using the CRACK method — targeted weight-level surgery on two weight tensors in the original model. No fine-tuning, no LoRA, no prompt engineering, no template modifications were used. The Vision-Language tower is completely untouched.
## Also Available
| Quant | Access | Link |
|-------|--------|------|
| **4-bit** | Free | [dealignai/Qwen3.5-VL-122B-A10B-4bit-MLX-CRACK](https://huggingface.co/dealignai/Qwen3.5-VL-122B-A10B-4bit-MLX-CRACK) |
| **6-bit** | Gated | [dealignai/Qwen3.5-VL-122B-A10B-6bit-MLX-CRACK](https://huggingface.co/dealignai/Qwen3.5-VL-122B-A10B-6bit-MLX-CRACK) |
| **8-bit** | Gated | [dealignai/Qwen3.5-VL-122B-A10B-8bit-MLX-CRACK](https://huggingface.co/dealignai/Qwen3.5-VL-122B-A10B-8bit-MLX-CRACK) |
I also have a **397B** version — reach out if interested.
## About
Built by [Dealign.AI](https://dealign.ai) — independent research into MoE safety mechanisms.
See our research: [Safety Generalization in Frontier MoE Models](https://dealign.ai/quantsteer.html)
Follow us: [𝕏 @dealignai](https://x.com/dealignai)
**Base model:** [Qwen/Qwen3.5-VL-122B-A10B](https://huggingface.co/Qwen/Qwen3.5-VL-122B-A10B)
## License
This model is released under the [Apache License 2.0](https://www.apache.org/licenses/LICENSE-2.0), consistent with the original Qwen 3.5 VL base model license. You are free to use, modify, and distribute this model for both commercial and non-commercial purposes. Provided "as-is" for research purposes.
---
## Support dealignai
All models are built from original research and published for free. These models are specifically crafted to be excellent coders and general-purpose assistants.
**[Support us on Ko-fi](https://ko-fi.com/dealignai)** — check out the Ko-fi membership for early access and extras.
Have questions or need help with a specific model? **DM us — we help for free most of the time.**
[Ko-fi](https://ko-fi.com/dealignai) | [X @dealignai](https://x.com/dealignai) | [dealign.ai](https://dealign.ai)