Any chance you'd consider releasing the LoRA adapter separately?
Hey, first off, really impressive work on this distillation. The autonomy improvements in agentic environments especially caught my eye.
Any chance you'd consider releasing the LoRA adapter separately? Would love to experiment with merging it onto different base checkpoints and see how the reasoning style transfers. Totally understand if there are reasons to keep it bundled, but figured it was worth asking!
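For context on what I'd do with it: merging a LoRA onto a base checkpoint is just W' = W + (alpha/r)·B·A per target layer, so a standalone adapter transfers cleanly. A toy numpy sketch of the merge math (all shapes and names here are illustrative, not from this repo):

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, r, alpha = 8, 6, 2, 4   # toy dimensions; real LoRA ranks are larger

W = rng.standard_normal((d_out, d_in))   # base layer weight
A = rng.standard_normal((r, d_in))       # LoRA down-projection
B = rng.standard_normal((d_out, r))      # LoRA up-projection

# merge the adapter into the base weight: W' = W + (alpha/r) * B @ A
W_merged = W + (alpha / r) * (B @ A)

# sanity check: the merged weight matches base-plus-adapter applied separately
x = rng.standard_normal(d_in)
assert np.allclose(W_merged @ x, W @ x + (alpha / r) * (B @ A @ x))
```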
Thanks either way for putting this out in the open.
Hi, thanks for your support.
The v1 LoRA adapter was overwritten during development, since there were many iterative updates along the way. Please try the latest v3 release; it's the best version so far.
@Jackrong I tested the consolidated v3 and it turned out really well, noticeably better than v1. The reason I'm asking for the LoRA adapter is that it would be incredible to apply this same style to other fine-tuned models. Without the adapter, the only alternative is to extract an approximate LoRA by diffing the fine-tuned model against its base, which inevitably loses some fidelity.
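To illustrate the fidelity point: diff-based extraction takes a truncated SVD of the weight delta between the fine-tuned and base checkpoints. It is only exact if the update really was low-rank; on a real merged model the delta is full-rank and truncation throws information away. A minimal numpy sketch (toy shapes, purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
d_out, d_in, r = 8, 6, 2

# here we pretend the fine-tune really was a rank-r update, so extraction
# recovers it exactly; on real checkpoints the delta is full-rank and the
# rank-r truncation below loses fidelity
W_base = rng.standard_normal((d_out, d_in))
true_B = rng.standard_normal((d_out, r))
true_A = rng.standard_normal((r, d_in))
W_ft = W_base + true_B @ true_A

# extract an approximate LoRA from the weight delta via truncated SVD
delta = W_ft - W_base
U, S, Vt = np.linalg.svd(delta, full_matrices=False)
B = U[:, :r] * np.sqrt(S[:r])               # recovered up-projection
A = np.sqrt(S[:r])[:, None] * Vt[:r]        # recovered down-projection

# exact only because delta was genuinely rank-r in this toy setup
assert np.allclose(B @ A, delta)
```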
Would you consider saving and publishing the LoRA weights from this v3 training, even as a separate experimental release? It doesn't need to be polished; the raw adapter alone would already help a lot for those who want to explore and adapt this to other setups.
Either way, excellent work; it's turning out very well.