Huihui-Qwen3.5-9B-abliterated-TIES
This is a merge of pre-trained language models created using a custom fork of mergekit with Qwen3.5 architecture support.
Merge Details
Merge Method
This model was merged using the TIES merge method, with Qwen/Qwen3.5-9B-Base as the base.
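For readers unfamiliar with TIES, the method trims each fine-tune's delta from the base by a density threshold, elects a majority sign per parameter, and averages only the deltas that agree with that sign. A minimal NumPy sketch of the idea (1-D toy tensors, not mergekit's actual implementation):

```python
import numpy as np

def ties_merge(base, tasks, weights, density):
    """Toy TIES merge on 1-D arrays: trim, elect sign, disjoint merge."""
    deltas = []
    for task, w in zip(tasks, weights):
        delta = task - base
        # 1. Trim: keep only the top `density` fraction by magnitude.
        k = max(1, int(round(delta.size * density)))
        thresh = np.sort(np.abs(delta), axis=None)[-k]
        deltas.append(w * np.where(np.abs(delta) >= thresh, delta, 0.0))
    stacked = np.stack(deltas)

    # 2. Elect sign: sign of the weighted sum per parameter.
    elected = np.sign(stacked.sum(axis=0))

    # 3. Disjoint merge: average only deltas agreeing with the elected
    #    sign, normalized by the total weight that agreed.
    agree = (np.sign(stacked) == elected) & (stacked != 0.0)
    total = np.where(agree, stacked, 0.0).sum(axis=0)
    norm = np.where(agree, np.array(weights).reshape(-1, 1), 0.0).sum(axis=0)
    return base + np.where(norm > 0, total / np.maximum(norm, 1e-9), 0.0)
```

Conflicting signs cancel out of the elected direction, which is why TIES tolerates merging several fine-tunes of the same base.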
Models Merged
The following models were included in the merge:
- nbeerbower/Huihui-Qwen3.5-9B-abliterated-Grimoire-SFT
- nbeerbower/Huihui-Qwen3.5-9B-abliterated-Grimoire-ORPO
- huihui-ai/Huihui-Qwen3.5-9B-abliterated
Mergekit Changes for Qwen3.5
Qwen3.5 uses a hybrid attention architecture (3:1 linear/full attention layers) that mergekit does not yet support upstream. The following changes were made to enable this merge:
- New architecture definitions (`qwen3_5.json`, `qwen3_5_text.json`) - Defines tensor mappings for both `Qwen3_5ForConditionalGeneration` (VLM) and `Qwen3_5ForCausalLM` (text-only) architectures. The hybrid `self_attn` and `linear_attn` layer weights are marked as optional since they only appear on specific layers (full attention every 4th layer, linear attention on the rest).
- Config key fallback (`mergekit/common.py`) - Added a fallback in `get_config_value` so that nested config keys like `text_config.num_hidden_layers` gracefully resolve to `num_hidden_layers` on text-only model configs. This enables cross-architecture merges between VLM and text-only Qwen3.5 variants.
- Vision weight grafting - The SFT and ORPO fine-tuned models are text-only (`Qwen3_5ForCausalLM`) and lack the vision encoder. After the text merge, the vision encoder (`model.visual.*`) and MTP head (`mtp.*`) weights were grafted from the base VLM (Qwen/Qwen3.5-9B-Base) to produce a complete VLM.
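The config key fallback can be sketched as follows. This is a simplified illustration on plain dicts; mergekit's real `get_config_value` operates on transformers config objects:

```python
def get_config_value(config: dict, key: str, default=None):
    """Resolve a dotted config key, falling back to the leaf name.

    "text_config.num_hidden_layers" first tries the nested path; if the
    wrapper is missing (text-only configs), it retries the bare leaf key
    "num_hidden_layers" at the top level.
    """
    node = config
    for part in key.split("."):
        if isinstance(node, dict) and part in node:
            node = node[part]
        else:
            break
    else:
        return node  # full nested path resolved

    leaf = key.split(".")[-1]
    if leaf in config:
        return config[leaf]  # fallback for text-only configs
    return default
```

With this fallback, the same lookup works on both the VLM config (which nests layer counts under `text_config`) and the flat text-only config.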
Configuration
The following YAML configuration was used to produce this model:
```yaml
models:
  - model: huihui-ai/Huihui-Qwen3.5-9B-abliterated
    parameters:
      weight: 0.4
      density: 0.6
  - model: nbeerbower/Huihui-Qwen3.5-9B-abliterated-Grimoire-SFT
    parameters:
      weight: 0.3
      density: 0.6
  - model: nbeerbower/Huihui-Qwen3.5-9B-abliterated-Grimoire-ORPO
    parameters:
      weight: 0.5
      density: 0.6
merge_method: ties
base_model: Qwen/Qwen3.5-9B-Base
parameters:
  normalize: true
  int8_mask: true
dtype: bfloat16
```
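The post-merge grafting step described above (copying `model.visual.*` and `mtp.*` tensors from the base VLM into the merged text-only checkpoint) can be sketched like this. Plain dicts stand in for state dicts here; real checkpoints would be streamed shard by shard with safetensors:

```python
def graft_weights(merged_text, base_vlm, prefixes=("model.visual.", "mtp.")):
    """Copy vision-encoder and MTP-head tensors from the base VLM into
    the merged text-only state dict, producing a complete VLM.

    Only tensors under the given prefixes are taken from the base; all
    merged text weights are kept as-is.
    """
    out = dict(merged_text)
    for name, tensor in base_vlm.items():
        if name.startswith(prefixes):
            out[name] = tensor
    return out
```

Because the text merge never touches the vision tower, grafting the base's vision and MTP tensors verbatim keeps them consistent with the `Qwen3_5ForConditionalGeneration` layout.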