Qwen 3.5 35B-A3B MoE — CRACK Abliterated (8-bit MLX)
Constrained Response Alignment Circuit Kill
Architecture-aware weight surgery with full Vision-Language preservation.
No fine-tuning. No system prompts. No template tricks. Pure sparse MoE weight surgery.
What This Is
A truly abliterated Qwen 3.5 35B-A3B Mixture-of-Experts (MoE) model — 8-bit quantized for Apple Silicon MLX with full Vision-Language support. The 35B model activates only ~3B parameters per forward pass, so it runs extremely fast while retaining the capabilities of a 35B-class model.
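To see why only ~3B parameters are active, here is a minimal sketch of top-k MoE routing with numpy. Sizes and the gating scheme are illustrative, not Qwen 3.5's actual router:

```python
# Illustrative sketch of sparse top-k MoE routing (hypothetical sizes;
# not the actual Qwen 3.5 router). Only k experts run per token, so the
# active parameter count stays a small fraction of the total.
import numpy as np

def topk_moe(x, gate_w, experts, k=2):
    """Route a token vector x to the top-k experts by gate score."""
    scores = x @ gate_w                      # (num_experts,) gate logits
    top = np.argsort(scores)[-k:]            # indices of the k best experts
    weights = np.exp(scores[top])
    weights /= weights.sum()                 # softmax over the selected experts
    # Only the chosen experts' weights are touched this forward pass.
    return sum(w * (experts[i] @ x) for w, i in zip(weights, top))

rng = np.random.default_rng(0)
d, num_experts = 8, 16
gate_w = rng.normal(size=(d, num_experts))
experts = [rng.normal(size=(d, d)) for _ in range(num_experts)]
y = topk_moe(rng.normal(size=d), gate_w, experts, k=2)
```

With k=2 of 16 experts active, only 1/8 of the expert weights participate in each forward pass — the same principle that lets the 35B-A3B run at ~3B-active cost.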
- ✅ Real weight surgery — multi-vector alignment directly patching the safetensors
- ✅ Full Vision-Language AND Tool Calling — unlike smaller dense models, the 35B-A3B retains complex `<tool_call>` capabilities and reasoning loops without breaking
- ✅ Very Fast — ~80 tokens/sec on Apple Silicon MLX
- ✅ LM Studio compatible — correct mRoPE config, works out of the box
- ✅ ~35 GB — Efficient memory usage due to MLX natively quantized MoE structures
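For context on the tool-calling claim: Qwen-family models emit each tool call as JSON wrapped in `<tool_call>…</tool_call>` tags (format assumed from Qwen's published chat template). A minimal parser sketch — the helper name is ours, not part of any library:

```python
# Minimal sketch: extract tool calls from model output. Qwen-style models
# wrap each call in <tool_call>...</tool_call> tags containing JSON with
# "name" and "arguments" keys (format assumed from Qwen's chat template).
import json
import re

TOOL_CALL_RE = re.compile(r"<tool_call>\s*(\{.*?\})\s*</tool_call>", re.DOTALL)

def extract_tool_calls(text):
    """Return a list of {"name": ..., "arguments": ...} dicts found in text."""
    return [json.loads(m) for m in TOOL_CALL_RE.findall(text)]

output = (
    'Let me check the weather.\n'
    '<tool_call>\n{"name": "get_weather", "arguments": {"city": "Oslo"}}\n</tool_call>'
)
calls = extract_tool_calls(output)
```

"Structural integrity" in the table below means exactly this: the tags stay balanced and the JSON stays parseable after the weight surgery.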
Performance
| Metric | Value |
|---|---|
| Generation Speed | ~80 tok/s (Apple Silicon, MLX) |
| Bits per Weight | 8.596 (8-bit, group_size=64) |
| Model Size | ~35 GB |
| Compliance | 100% (8/8 test prompts) |
| Knowledge Accuracy | 100% (math, science, history, geography) |
| Code Generation | ✅ |
| Multi-turn | ✅ Full context retention |
| Coherence | ✅ No garbling, no repetition loops |
| Vision | ✅ Full VL support |
| Tool Calling | ✅ Perfect `<tool_call>` structural integrity |
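The bits-per-weight figure follows from group quantization arithmetic. Assuming MLX's scheme of a 16-bit scale and 16-bit bias per group of 64 weights (our reading of MLX's affine quantization, not a published breakdown of this model), the quantized tensors land at 8.5 bpw:

```python
# Back-of-envelope bits-per-weight for 8-bit group quantization,
# assuming a 16-bit scale and 16-bit bias per group of 64 weights.
bits = 8
group_size = 64
overhead = (16 + 16) / group_size     # scale + bias amortized per weight
bpw = bits + overhead                 # 8.5 bpw for quantized tensors
# The reported 8.596 bpw is slightly higher, consistent with some
# tensors (e.g. norms or embeddings) being kept at higher precision.
```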
Usage
```python
from mlx_vlm import load, generate
from mlx_vlm.prompt_utils import apply_chat_template
from mlx_vlm.utils import load_config

model, processor = load("dealignai/Qwen3.5-VL-35B-A3B-8bit-MLX-CRACK")
config = load_config("dealignai/Qwen3.5-VL-35B-A3B-8bit-MLX-CRACK")

# Text generation
prompt = apply_chat_template(processor, config, "Your prompt here", num_images=0)
output = generate(model, processor, prompt, max_tokens=500, verbose=True)

# Vision (with image)
prompt = apply_chat_template(processor, config, "Describe this image", num_images=1)
output = generate(model, processor, prompt, max_tokens=500, verbose=True, image=["path/to/image.png"])
```
How This Model Was Modified
Created using CRACK (Constrained Response Alignment Circuit Kill) — targeted weight-level surgery developed for hybrid SSM/Attention and MoE architectures. It uses multi-vector alignment with refusal vectors extracted per layer.
Methodology
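CRACK's exact procedure is not published here, but the generic "abliteration" technique it builds on projects an extracted refusal direction out of weight matrices that write to the residual stream: W' = (I - r rᵀ) W for a unit refusal vector r. A numpy sketch of that standard directional ablation — illustrative only, not the CRACK implementation:

```python
# Generic directional ablation (the standard "abliteration" recipe, shown
# for illustration; CRACK's per-layer multi-vector variant is not public).
import numpy as np

def ablate_direction(W, r):
    """Project the unit direction r out of W's output space: W' = (I - r r^T) W."""
    r = r / np.linalg.norm(r)
    return W - np.outer(r, r) @ W

rng = np.random.default_rng(0)
d = 16
W = rng.normal(size=(d, d))     # stand-in for a weight writing to the residual stream
r = rng.normal(size=d)          # stand-in for an extracted "refusal direction"
W_ablated = ablate_direction(W, r)

# After ablation, no input can produce output along r:
r_unit = r / np.linalg.norm(r)
```

Applied directly to the safetensors, this kind of edit changes behavior without any fine-tuning, which is why the card can claim "no fine-tuning, no system prompts".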
The CRACK Qwen 3.5 Family
| Model | Architecture | Quant | Speed | Size | Access | Link |
|---|---|---|---|---|---|---|
| 2B | Dense | 4-bit | 248 tok/s | 1.6 GB | Free | Qwen3.5-VL-2B-4bit |
| 2B | Dense | 8-bit | 187 tok/s | 2.6 GB | Free | Qwen3.5-VL-2B-8bit |
| 4B | Dense | 4-bit | 150 tok/s | 2.9 GB | Free | Qwen3.5-VL-4B-4bit |
| 4B | Dense | 8-bit | 105 tok/s | 4.8 GB | Free | Qwen3.5-VL-4B-8bit |
| 9B | Dense | 4-bit | 103 tok/s | 5.6 GB | Free | Qwen3.5-VL-9B-4bit |
| 9B | Dense | 8-bit | 66 tok/s | 9.8 GB | Free | Qwen3.5-VL-9B-8bit |
| 35B | MoE (A3B) | 4-bit | ~88 tok/s | ~18.5 GB | Free | Qwen3.5-VL-35B-A3B-4bit |
| 35B | MoE (A3B) | 8-bit | ~80 tok/s | ~35 GB | Free | This model |
| 122B | MoE (A10B) | 4-bit | 56+ tok/s | 65 GB | Free | Qwen3.5-VL-122B-4bit |
| 122B | MoE (A10B) | 6-bit | — | ~85 GB | Gated | Qwen3.5-VL-122B-6bit |
| 122B | MoE (A10B) | 8-bit | — | ~110 GB | Gated | Qwen3.5-VL-122B-8bit |
About
Built by Dealign.AI — independent research into safety mechanisms in frontier AI models.
See our research: Safety Generalization in Frontier MoE Models
Follow us: X @dealignai
Base model: Qwen/Qwen3.5-35B-A3B-Instruct
License
Released under the Apache License 2.0, consistent with the original Qwen 3.5 base model. Provided "as-is" for research purposes.
Support dealignai
All models are built from original research and published for free. These models are specifically crafted to be excellent coders and general-purpose assistants.
Support us on Ko-fi — check out the Ko-fi membership for early access and extras.
Have questions or need help with a specific model? DM us — we help for free most of the time.
Ko-fi | X @dealignai | dealign.ai