---
license: apache-2.0
base_model:
- Qwen/Qwen3.5-9B
tags:
- merge
- evolutionary-merge
- darwin
- darwin-v5
- model-mri
- reasoning
- advanced-reasoning
- chain-of-thought
- thinking
- qwen3.5
- qwen
- claude-opus
- distillation
- multilingual
- benchmark
- open-source
- apache-2.0
- layer-wise-merge
- coding-agent
- tool-calling
- long-context
language:
- en
- zh
- ko
- ja
- de
- fr
- es
- ru
- ar
- multilingual
pipeline_tag: text-generation
library_name: transformers
model-index:
- name: Darwin-9B-Opus
  results:
  - task:
      type: text-generation
      name: Graduate-Level Reasoning
    dataset:
      type: Idavidrein/gpqa
      name: GPQA Diamond
      config: gpqa_diamond
      split: train
    metrics:
    - type: accuracy
      value: 90.0
      name: Accuracy
      verified: false
---

# Darwin-9B-Opus

*"Compact reasoning powerhouse — 9B parameters, graduate-level intelligence."*

<p align="center">
  <a href="https://huggingface.co/FINAL-Bench/Darwin-9B-Opus"><img src="https://img.shields.io/badge/🧬_Model-Darwin--9B--Opus-blue?style=for-the-badge" alt="Model"></a>
  <a href="https://huggingface.co/spaces/FINAL-Bench/Darwin-9B-Opus"><img src="https://img.shields.io/badge/🚀_Space-Live_Demo-purple?style=for-the-badge" alt="Space"></a>
  <a href="https://huggingface.co/spaces/FINAL-Bench/Leaderboard"><img src="https://img.shields.io/badge/🏆_FINAL_Bench-Leaderboard-green?style=for-the-badge" alt="FINAL Bench"></a>
  <a href="https://huggingface.co/spaces/FINAL-Bench/all-bench-leaderboard"><img src="https://img.shields.io/badge/📊_ALL_Bench-Leaderboard-orange?style=for-the-badge" alt="ALL Bench"></a>
</p>

> **Qwen3.5 Dense 9B** | Reasoning | Chain-of-Thought | 131K Context | 201 Languages | BF16 | Apache 2.0

---

## Overview

Darwin-9B-Opus is a **9B-parameter dense** reasoning model created with **Darwin V5**, an evolutionary merge engine with Model MRI integration. Built on the Qwen3.5-9B architecture, it inherits structured step-by-step reasoning through Claude 4.6 Opus distillation while retaining the full multilingual and long-context capabilities of the base model.

---

## Model Specifications

| | |
|---|---|
| Architecture | Qwen3.5 Dense |
| Total Parameters | 9B |
| Precision | BF16 |
| Context Length | 131,072 tokens (native) |
| Languages | 201 |
| Thinking | `<think>`-tag chain-of-thought reasoning |
| License | Apache 2.0 |

---

## Hardware Requirements

| Setup | VRAM | Status |
|---|---|---|
| BF16 Full Precision | ~20 GB | |
| NVIDIA A10G 24GB | 24 GB | ✅ Comfortable |
| NVIDIA RTX 4090 24GB | 24 GB | ✅ Comfortable |
| NVIDIA A100 40GB | 40 GB | ✅ Very comfortable |
| NVIDIA T4 16GB | 16 GB | ⚠️ Requires quantization |
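
The BF16 row follows from simple arithmetic: 9 billion parameters at 2 bytes each is roughly 17 GB of weights, plus headroom for activations and the KV cache. A quick back-of-envelope helper (the 10% overhead factor here is an illustrative assumption, not a measured value):

```python
def estimate_vram_gb(params_billions: float, bytes_per_param: float,
                     overhead: float = 1.1) -> float:
    """Rough VRAM estimate: weight size times an assumed overhead factor
    for activations and KV cache. Illustrative only, not a measured value."""
    weights_gb = params_billions * 1e9 * bytes_per_param / 1024**3
    return round(weights_gb * overhead, 1)

print(estimate_vram_gb(9, 2))    # BF16, 2 bytes per parameter -> 18.4
print(estimate_vram_gb(9, 0.5))  # 4-bit, ~0.5 bytes per parameter -> 4.6
```

The second call shows why the T4 row needs quantization: at 4 bits the weights fit comfortably within 16 GB, while BF16 does not.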

---

## Usage

### Transformers

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

tokenizer = AutoTokenizer.from_pretrained(
    "FINAL-Bench/Darwin-9B-Opus",
    trust_remote_code=True,
)
model = AutoModelForCausalLM.from_pretrained(
    "FINAL-Bench/Darwin-9B-Opus",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

messages = [{"role": "user", "content": "Prove that √2 is irrational."}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=4096)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```
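
Because the model emits its chain of thought inside `<think>` tags (see the specifications table), you will usually want to separate the reasoning trace from the final answer before displaying it. A minimal parser sketch, assuming Qwen-style `<think>...</think>` output (the helper name is ours, not part of any API):

```python
import re

def split_thinking(text: str) -> tuple[str, str]:
    """Split generated text into (reasoning, answer).
    Assumes at most one <think>...</think> block, as in Qwen-style output."""
    match = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if match is None:
        return "", text.strip()          # no thinking block emitted
    reasoning = match.group(1).strip()   # the chain-of-thought trace
    answer = text[match.end():].strip()  # everything after </think>
    return reasoning, answer

reasoning, answer = split_thinking(
    "<think>Assume √2 = p/q in lowest terms ...</think>Therefore √2 is irrational."
)
print(answer)  # -> Therefore √2 is irrational.
```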

### SGLang

```bash
python -m sglang.launch_server \
  --model-path FINAL-Bench/Darwin-9B-Opus \
  --tp 1 \
  --mem-fraction-static 0.90 \
  --context-length 32768 \
  --trust-remote-code
```

### vLLM

```bash
vllm serve FINAL-Bench/Darwin-9B-Opus \
  --trust-remote-code \
  --enforce-eager
```

---

## What Makes Darwin Special?

Darwin-9B-Opus is the product of **Darwin V5**, which pairs an evolutionary merge search with layer-wise diagnostic profiling ("Model MRI") of the parent models.

### Darwin V5 Pipeline

```
[Phase 0] Model MRI — profile both parents layer by layer
    ↓ Measure: layer importance, probe cosine distance
    ↓
[Phase 1] MRI-Guided Evolution — diagnostic-informed initial genome
    ↓ Seeded from profiling results rather than at random
    ↓
[Phase 2] mergekit real merge + benchmark fitness selection
    ↓ Faster convergence in the MRI-narrowed search space
    ↓
[Phase 3] MRI Health Check — profile the child model
    ↓ Detect interference and function loss
    ↓ Prescribe layer-specific ratio adjustments
    ↓
[Final] Darwin-9B-Opus
```
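
Darwin V5 itself is not released, but the loop in Phases 1 and 2 can be illustrated with a toy sketch: the genome is a vector of per-layer interpolation weights, fitness is a benchmark score, and each generation keeps the best mutated candidate. Every detail below (scalar "layers", the fitness function, mutation scale, population size) is made up for illustration and is not the actual engine:

```python
import random

def merge_layers(parent_a, parent_b, genome):
    """Layer-wise linear interpolation: child_i = w_i*A_i + (1 - w_i)*B_i."""
    return [w * a + (1 - w) * b for w, a, b in zip(genome, parent_a, parent_b)]

def evolve(parent_a, parent_b, fitness, n_layers, generations=30, pop=8, seed=0):
    rng = random.Random(seed)
    # Phase 1 analogue: an MRI-informed init would bias these weights;
    # here we simply start every layer at 0.5.
    best = [0.5] * n_layers
    best_fit = fitness(merge_layers(parent_a, parent_b, best))
    for _ in range(generations):
        for _ in range(pop):
            # Phase 2 analogue: mutate, merge, score, keep improvements.
            cand = [min(1.0, max(0.0, w + rng.gauss(0, 0.1))) for w in best]
            f = fitness(merge_layers(parent_a, parent_b, cand))
            if f > best_fit:
                best, best_fit = cand, f
    return best, best_fit

# Toy problem: "layers" are scalars, and the "benchmark" rewards a known blend.
A, B = [1.0] * 4, [0.0] * 4
target = [0.9, 0.7, 0.3, 0.1]
score = lambda merged: -sum((m - t) ** 2 for m, t in zip(merged, target))
genome, fit = evolve(A, B, score, n_layers=4)
print([round(w, 2) for w in genome])  # converges toward the target blend
```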

---

## Built By

| | |
|---|---|
| Developer | **VIDRAFT** |
| Engine | Darwin V5 (Evolutionary Merge + Model MRI) |
| Merge Backend | mergekit (DARE-TIES) |
| Base Architecture | Qwen3.5-9B |
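
The merge backend above is mergekit's DARE-TIES method. In outline: DARE randomly drops a fraction of each donor's delta from the base weights and rescales the survivors by 1/(1 - drop_rate) so the expected delta is preserved; TIES then resolves sign conflicts between donors by electing the sign with the larger total magnitude before combining. A toy single-tensor sketch on plain Python lists (the drop rate, donor deltas, and averaging of agreeing values are illustrative; the real method runs over full weight tensors inside mergekit):

```python
import random

def dare_prune(delta, drop_rate, rng):
    """DARE: randomly zero a fraction of delta entries, rescale survivors
    by 1/(1 - drop_rate) so the expected delta is unchanged."""
    scale = 1.0 / (1.0 - drop_rate)
    return [0.0 if rng.random() < drop_rate else d * scale for d in delta]

def dare_ties_merge(base, deltas, drop_rate=0.5, seed=0):
    rng = random.Random(seed)
    pruned = [dare_prune(d, drop_rate, rng) for d in deltas]
    merged = list(base)
    for i in range(len(base)):
        col = [p[i] for p in pruned]
        # TIES sign election: keep the sign with the larger total magnitude,
        # then combine only the values that agree with it.
        pos = sum(v for v in col if v > 0)
        neg = -sum(v for v in col if v < 0)
        sign = 1.0 if pos >= neg else -1.0
        kept = [v for v in col if v * sign > 0]
        if kept:
            merged[i] += sum(kept) / len(kept)
    return merged

base = [0.1, -0.2, 0.3]     # hypothetical base weights
delta_a = [0.4, 0.1, -0.2]  # hypothetical donor deltas
delta_b = [0.2, -0.3, -0.1]
print(dare_ties_merge(base, [delta_a, delta_b]))
```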

---

## Acknowledgements

- **Korean Government** — GPU Support Program research grant
- [Qwen Team](https://huggingface.co/Qwen) — Qwen3.5 base architecture
- [mergekit](https://github.com/arcee-ai/mergekit) — merge backend infrastructure

---

## Citation

```bibtex
@misc{vidraft_darwin_9b_opus,
  title        = {Darwin-9B-Opus: Compact Reasoning Model via Diagnostic-Guided Evolutionary Merge},
  author       = {VIDRAFT},
  year         = {2026},
  publisher    = {Hugging Face},
  howpublished = {\url{https://huggingface.co/FINAL-Bench/Darwin-9B-Opus}}
}
```

---

## Contact

📧 **kkms1116@koreacu.ac.kr**