Huihui-Qwen3.5-9B-abliterated-TIES
This is a merge of pre-trained language models created using a custom fork of mergekit with Qwen3.5 architecture support.
Merge Details
Merge Method
This model was merged using the TIES merge method, with Qwen/Qwen3.5-9B-Base as the base.
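For readers unfamiliar with TIES, the method trims each fine-tune's delta from the base by a density threshold, elects a majority sign per parameter, and averages only the deltas that agree with that sign. A minimal NumPy sketch of the idea (1-D toy tensors, not mergekit's actual implementation):

```python
import numpy as np

def ties_merge(base, tasks, weights, density):
    """Toy TIES merge on 1-D arrays: trim, elect sign, disjoint merge."""
    deltas = []
    for task, w in zip(tasks, weights):
        delta = task - base
        # 1. Trim: keep only the top `density` fraction by magnitude.
        k = max(1, int(round(delta.size * density)))
        thresh = np.sort(np.abs(delta), axis=None)[-k]
        deltas.append(w * np.where(np.abs(delta) >= thresh, delta, 0.0))
    stacked = np.stack(deltas)

    # 2. Elect sign: sign of the weighted sum per parameter.
    elected = np.sign(stacked.sum(axis=0))

    # 3. Disjoint merge: average only deltas agreeing with the elected
    #    sign, normalized by the total weight that agreed.
    agree = (np.sign(stacked) == elected) & (stacked != 0.0)
    total = np.where(agree, stacked, 0.0).sum(axis=0)
    norm = np.where(agree, np.array(weights).reshape(-1, 1), 0.0).sum(axis=0)
    return base + np.where(norm > 0, total / np.maximum(norm, 1e-9), 0.0)
```

Conflicting signs cancel out of the elected direction, which is why TIES tolerates merging several fine-tunes of the same base.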
Models Merged
The following models were included in the merge:
- nbeerbower/Huihui-Qwen3.5-9B-abliterated-Grimoire-SFT
- nbeerbower/Huihui-Qwen3.5-9B-abliterated-Grimoire-ORPO
- huihui-ai/Huihui-Qwen3.5-9B-abliterated
Mergekit Changes for Qwen3.5
Qwen3.5 uses a hybrid attention architecture (3:1 linear/full attention layers) that mergekit does not yet support upstream. The following changes were made to enable this merge:
- New architecture definitions (`qwen3_5.json`, `qwen3_5_text.json`) - Defines tensor mappings for both `Qwen3_5ForConditionalGeneration` (VLM) and `Qwen3_5ForCausalLM` (text-only) architectures. The hybrid `self_attn` and `linear_attn` layer weights are marked as optional since they only appear on specific layers (full attention every 4th layer, linear attention on the rest).
- Config key fallback (`mergekit/common.py`) - Added a fallback in `get_config_value` so that nested config keys like `text_config.num_hidden_layers` gracefully resolve to `num_hidden_layers` on text-only model configs. This enables cross-architecture merges between VLM and text-only Qwen3.5 variants.
- Vision weight grafting - The SFT and ORPO fine-tuned models are text-only (`Qwen3_5ForCausalLM`) and lack the vision encoder. After the text merge, the vision encoder (`model.visual.*`) and MTP head (`mtp.*`) weights were grafted from the base VLM (Qwen/Qwen3.5-9B-Base) to produce a complete VLM.
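The config key fallback can be sketched as follows. This is a simplified illustration on plain dicts; mergekit's real `get_config_value` operates on transformers config objects:

```python
def get_config_value(config: dict, key: str, default=None):
    """Resolve a dotted config key, falling back to the leaf name.

    "text_config.num_hidden_layers" first tries the nested path; if the
    wrapper is missing (text-only configs), it retries the bare leaf key
    "num_hidden_layers" at the top level.
    """
    node = config
    for part in key.split("."):
        if isinstance(node, dict) and part in node:
            node = node[part]
        else:
            break
    else:
        return node  # full nested path resolved

    leaf = key.split(".")[-1]
    if leaf in config:
        return config[leaf]  # fallback for text-only configs
    return default
```

With this fallback, the same lookup works on both the VLM config (which nests layer counts under `text_config`) and the flat text-only config.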
Configuration
The following YAML configuration was used to produce this model:
```yaml
models:
  - model: huihui-ai/Huihui-Qwen3.5-9B-abliterated
    parameters:
      weight: 0.4
      density: 0.6
  - model: nbeerbower/Huihui-Qwen3.5-9B-abliterated-Grimoire-SFT
    parameters:
      weight: 0.3
      density: 0.6
  - model: nbeerbower/Huihui-Qwen3.5-9B-abliterated-Grimoire-ORPO
    parameters:
      weight: 0.5
      density: 0.6
merge_method: ties
base_model: Qwen/Qwen3.5-9B-Base
parameters:
  normalize: true
  int8_mask: true
dtype: bfloat16
```
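The post-merge grafting step described above (copying `model.visual.*` and `mtp.*` tensors from the base VLM into the merged text-only checkpoint) can be sketched like this. Plain dicts stand in for state dicts here; real checkpoints would be streamed shard by shard with safetensors:

```python
def graft_weights(merged_text, base_vlm, prefixes=("model.visual.", "mtp.")):
    """Copy vision-encoder and MTP-head tensors from the base VLM into
    the merged text-only state dict, producing a complete VLM.

    Only tensors under the given prefixes are taken from the base; all
    merged text weights are kept as-is.
    """
    out = dict(merged_text)
    for name, tensor in base_vlm.items():
        if name.startswith(prefixes):
            out[name] = tensor
    return out
```

Because the text merge never touches the vision tower, grafting the base's vision and MTP tensors verbatim keeps them consistent with the `Qwen3_5ForConditionalGeneration` layout.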