# Qwen 3.5 4B – Cagatay LoRA
A LoRA fine-tune of Qwen/Qwen3.5-4B for robotics reasoning and instruction following.
## What is this?
A LoRA adapter built on Qwen 3.5 4B for embodied AI task reasoning. Fine-tuned using SFT (Supervised Fine-Tuning) via TRL on HuggingFace Jobs infrastructure.
## Training Details
| Parameter | Value |
|---|---|
| Base Model | Qwen/Qwen3.5-4B |
| Method | LoRA (PEFT) + SFT (TRL) |
| Rank (r) | 32 |
| Alpha | 64 |
| Dropout | 0.05 |
| Target Modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Adapter Size | 100 MB |
| Framework | TRL 0.29.1, Transformers 5.3.0, PyTorch 2.10.0, PEFT 0.18.1 |
| Training | HuggingFace Jobs (cloud GPU) |
## Quick Start
```python
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="cagataydev/qwen3.5-4B-cagatay",
    device="cuda",
)

# Robotics reasoning
output = generator(
    [{"role": "user", "content": "Plan the steps to pick up a cup from the table and place it in the sink."}],
    max_new_tokens=256,
    return_full_text=False,
)[0]
print(output["generated_text"])
```
### With PEFT (explicit adapter loading)
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model, then attach the LoRA adapter on top of it
base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3.5-4B", torch_dtype="auto", device_map="auto"
)
model = PeftModel.from_pretrained(base, "cagataydev/qwen3.5-4B-cagatay")
tokenizer = AutoTokenizer.from_pretrained("cagataydev/qwen3.5-4B-cagatay")
```
## Use Cases
- Robotics task planning – Break down high-level commands into step-by-step plans
- Embodied reasoning – Spatial understanding and action sequencing
- Lightweight deployment – 4B params fits on edge devices (Jetson, consumer GPUs)
- Neon VLA component – Part of the Neon VLA vision-language-action stack
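As a rough sanity check on the edge-deployment claim: weight memory scales linearly with parameter count and bytes per parameter. The numbers below are back-of-envelope estimates (the 4B figure is nominal, and activations, KV cache, and runtime overhead come on top), not measurements from this card:

```python
# Back-of-envelope weight memory for a ~4B-parameter model.
# These are estimates, not measured numbers from this model card.
PARAMS = 4e9

def weight_gb(bytes_per_param: float) -> float:
    """Approximate weight memory in decimal GB."""
    return PARAMS * bytes_per_param / 1e9

print(f"fp16/bf16: ~{weight_gb(2):.0f} GB")    # ~8 GB
print(f"int8:      ~{weight_gb(1):.0f} GB")    # ~4 GB
print(f"int4:      ~{weight_gb(0.5):.0f} GB")  # ~2 GB
```

At 4-bit quantization the weights alone fit within typical 8 GB edge-device budgets, with headroom left for the KV cache.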
## Related Models
| Model | Base | Params | Use Case |
|---|---|---|---|
| qwen2.5-omni-3b-cagatay | Qwen 2.5 3B | 1.8B | Voice command understanding |
| qwen3.5-4B-cagatay | Qwen 3.5 4B | 4B | Task planning (this model) |
| qwen3.5-35B-A3B-cagatay | Qwen 3.5 35B MoE | 35B (3B active) | Complex multi-step reasoning |
Built with DevDuck | Trained on HuggingFace Jobs | Part of the Neon VLA ecosystem