50 SDF LoRA model organisms on Qwen3-14B, each instilling one behavior in a narrow domain to study its generalization reach (LoRAcle project).
mats
non-profit
Chain-of-thought that hides what the model is really doing: cheating without saying so, latent soft-token, and filler-token reasoning.
Eval datasets for CoT Oracle: authority bias, sycophancy, hint following, decorative reasoning, and more.
Qwen3-8B model organisms (MOs): load-bearing chain-of-thought in ciphers of increasing strength (odometer task).
Full-parameter Qwen3-14B fine-tunes for the LoRAcle paper's LoRA-vs-full-FT SVD-truncation appendix comparison.
Training datasets for the CoT Trajectory Oracle. Includes the v5 corpus and conversational QA pairs.