khtsly

AI & ML interests

None yet

Recent Activity

upvoted a paper 1 day ago

LLMs as Noisy Channels: A Shannon Perspective on Model Capacity and Scaling Laws

upvoted a paper 3 days ago

Anti-Self-Distillation for Reasoning RL via Pointwise Mutual Information

upvoted a paper 3 days ago

HRM-Text: Efficient Pretraining Beyond Scaling

Organizations

None yet

upvoted a paper 1 day ago

LLMs as Noisy Channels: A Shannon Perspective on Model Capacity and Scaling Laws

Paper • 2605.23901 • Published 4 days ago • 9

upvoted 2 papers 3 days ago

Anti-Self-Distillation for Reasoning RL via Pointwise Mutual Information

Paper • 2605.11609 • Published 14 days ago • 191

HRM-Text: Efficient Pretraining Beyond Scaling

Paper • 2605.20613 • Published 6 days ago • 65

upvoted a paper 4 days ago

Gated DeltaNet-2: Decoupling Erase and Write in Linear Attention

Paper • 2605.22791 • Published 5 days ago • 26

upvoted a paper 5 days ago

Generative Recursive Reasoning

Paper • 2605.19376 • Published 6 days ago • 27

liked a model 30 days ago

khtsly/luau-coder-preview-28B-A3B-noft

Text Generation • 28B • Updated 30 days ago • 433 • 2

published a model 30 days ago

khtsly/luau-coder-preview-28B-A3B-noft

Text Generation • 28B • Updated 30 days ago • 433 • 2

updated a model 30 days ago

khtsly/luau-coder-preview-28B-A3B-noft

Text Generation • 28B • Updated 30 days ago • 433 • 2

updated a dataset about 1 month ago

khtsly/roblox_docs_corpus_text

Viewer • Updated Apr 23 • 1.55k • 31 • 1

New activity in Jackrong/Qwopus-GLM-18B-Merged-GGUF about 1 month ago

merging problem

👀 1

#5 opened about 1 month ago by

khtsly

New activity in google/gemma-4-31B-it about 1 month ago

Can anyone improve the model using the Rys methodology—by duplicating a block of layers?

#60 opened about 1 month ago by

Regrin

updated a dataset about 1 month ago

khtsly/luau-repo-docs-text

Viewer • Updated Apr 16 • 1.64k • 49 • 1

New activity in Kassadin88/GLM-5.1-1000000x about 1 month ago

â character

#3 opened about 1 month ago by

khtsly

upvoted a paper about 1 month ago

Attention Sink in Transformers: A Survey on Utilization, Interpretation, and Mitigation

Paper • 2604.10098 • Published Apr 11 • 82

updated 2 datasets about 1 month ago

khtsly/devforum-roblox-text

Viewer • Updated Apr 13 • 171k • 118 • 1

khtsly/luau-stack-hq

Viewer • Updated Apr 13 • 25.2k • 173 • 2

updated a model about 2 months ago

khtsly/luau-coder-1.5-preview-tokenizer

Updated Apr 9

published a model about 2 months ago

khtsly/luau-coder-1.5-preview-tokenizer

Updated Apr 9

New activity in omarkamali/wikipedia-monthly about 2 months ago

Hashtag (category)

#6 opened about 2 months ago by

khtsly

liked a dataset about 2 months ago

khtsly/devforum-roblox-text

Viewer • Updated Apr 13 • 171k • 118 • 1
