RESMP-DEV/Accessible_Qwen_4B


Model Card for Acc Qwen 4B

Acc Qwen 4B is a state-of-the-art accessibility model trained with GRPO reinforcement learning. It uses RM-R1-style Chain-of-Rubrics distillation of Claude 4 Opus, using Gemini 2.5 Flash, into Qwen 3 4B over 18 million tokens.

The code for training the model is at https://github.com/Nottlespike/Accessible_Qwen
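The model description names GRPO as the RL method. As a rough illustration only, and not the code from the repository above, the group-relative advantage step that GRPO is generally described with can be sketched as follows. The function name `group_relative_advantages`, the `eps` value, the use of the population standard deviation, and the example scores are all assumptions made for this sketch.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize each reward against its own sampling group (GRPO-style).

    `rewards` holds one scalar score per completion sampled for the same
    prompt. The result is each reward's offset from the group mean, scaled
    by the group's standard deviation (eps avoids division by zero).
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption of this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rubric scores for four completions of one prompt.
scores = [0.2, 0.5, 0.9, 0.4]
advantages = group_relative_advantages(scores)
for score, adv in zip(scores, advantages):
    print(f"reward={score:.2f} advantage={adv:+.3f}")
```

Completions that score above their group's mean receive a positive advantage and those below receive a negative one, so no separate value model is needed to provide a baseline.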


Safetensors

Model size: 4B params

Tensor type: BF16


Model tree for RESMP-DEV/Accessible_Qwen_4B

Base model: Qwen/Qwen3-4B-Base

Finetuned: this model
