Reinforcement Learning
Transformers
English
post-training
distillation
agentic-coding
composer-2.5
cursor
kimi-k2
grpo
dapo
diloco
openenv
trl
verl
research
methodology
Instructions to use Codeseys/composer-replication-framework with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use Codeseys/composer-replication-framework with Transformers:
# Load model directly
from transformers import AutoModel
model = AutoModel.from_pretrained("Codeseys/composer-replication-framework", dtype="auto")
- Notebooks
- Google Colab
- Kaggle
Commit History
Wave 17: close all 5 audit FLAGs + SDPO context alignment + serverless re-exports a84c060
Wave 16: install ergonomics + gradient evidence + SDPO end-to-end example c0a5ab7
Wave 15: 4-angle multi-model self-critique caught 2 math BLOCKERs in primary loss kernels; fixed against upstream byte-for-byte + GSM8K example + ergonomics e5add15
Wave 14: close every Wave 13 review finding + 4 documentation files; Wave 14b: real PRIME-RL parity + multi-process DiLoCo convergence d9dd3a5
Wave 13: serverless DiLoCo + replaysim normalization + 3 distillation losses + PRIME-RL + Monarch b266c31
Wave 12: close V1-V8 brief — GPU smoke, SDPO firing, real-trace e2e d88715c
Wave 11: cross-model adversarial review + honest down-revision f16fa23
Wave 7: Phase 2-4 of deep work loop — backlog, parallel research, three ADRs ac4bfb4
Wave 6: vision validation self-audit (5/10 to 9/10 in 5 days, no GPU) 040eff8
baladithyab committed on
Wave 3: integration architecture + spike-005 trainer skeleton (16 tests pass) fd77f74
Integrate Cursor blog directly + audit research note + add SDPO/OPSD link 1cede23