Commit 777470d
1 Parent(s): 5309422
Update README.md (#4)
- Update README.md (7aa8e4ede89cb3503af46185fad6639b1d481f2c)
Co-authored-by: Yizhe Zhang <YizheZ@users.noreply.huggingface.co>
README.md
CHANGED
@@ -15,6 +15,11 @@ This model is an example of the **Simple Self-Distillation (SimpleSD)** method t
 - **Self-distillation sampling:** temperature=1.1, top_p=0.95, top_k=20
 - **Evaluation sampling:** temperature=0.7, top_p=0.95, top_k=20
 
+paper: https://arxiv.org/abs/2604.01193
+
+code: https://github.com/apple/ml-ssd
+
+
 ## Notes
 - These are research checkpoints for reproducibility.
 - They are not optimized Qwen releases.
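For readers unfamiliar with the sampling settings listed in the README, here is a minimal pure-Python sketch of how `temperature`, `top_k`, and `top_p` can shape a next-token distribution. Only the numeric values come from the README. The function names (`filtered_probs`, `sample`), the toy logits, and the filtering order (temperature, then top-k, then top-p) are assumptions for illustration. Inference libraries may apply these steps in a different order, and this is not taken from the linked code.

```python
import math
import random

# Values copied from the README; the dictionary names are hypothetical.
SELF_DISTILLATION_SAMPLING = {"temperature": 1.1, "top_p": 0.95, "top_k": 20}
EVALUATION_SAMPLING = {"temperature": 0.7, "top_p": 0.95, "top_k": 20}


def filtered_probs(logits, temperature, top_p, top_k):
    """Return {token_id: prob} after temperature, top-k, then top-p filtering.

    The order of the three steps is an assumption of this sketch.
    """
    scaled = [x / temperature for x in logits]
    # Keep the top_k highest-scoring token ids, best first.
    ranked = sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)[:top_k]
    # Softmax over the kept tokens (subtract the max for numerical stability).
    peak = scaled[ranked[0]]
    weights = [math.exp(scaled[i] - peak) for i in ranked]
    total = sum(weights)
    # Nucleus step: keep the smallest prefix whose cumulative mass reaches top_p.
    kept, cumulative = [], 0.0
    for i, w in zip(ranked, weights):
        p = w / total
        kept.append((i, p))
        cumulative += p
        if cumulative >= top_p:
            break
    # Renormalize the surviving tokens so they sum to 1.
    norm = sum(p for _, p in kept)
    return {i: p / norm for i, p in kept}


def sample(logits, rng=random, **params):
    """Draw one token id from the filtered distribution."""
    probs = filtered_probs(logits, **params)
    ids = list(probs)
    return rng.choices(ids, weights=[probs[i] for i in ids], k=1)[0]


if __name__ == "__main__":
    toy_logits = [2.0, 1.0, 0.5, 0.0, -1.0]  # hypothetical 5-token vocabulary
    print(filtered_probs(toy_logits, **SELF_DISTILLATION_SAMPLING))
    print(filtered_probs(toy_logits, **EVALUATION_SAMPLING))
```

On the toy logits, the lower evaluation temperature concentrates more probability on the top token than the self-distillation setting does, which is the usual effect of reducing temperature.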