Alexey Gorbatovski

Myashka

AI & ML interests

NLP Alignment

Recent Activity

commented on a paper 2 days ago

Demystifying OPD: Length Inflation and Stabilization Strategies for Large Language Models

authored a paper 5 days ago

Trust-Region Behavior Blending for On-Policy Distillation

upvoted a paper 5 days ago

Trust-Region Behavior Blending for On-Policy Distillation

Organizations

None yet

commented on a paper 2 days ago

Demystifying OPD: Length Inflation and Stabilization Strategies for Large Language Models

Paper • 2604.08527 • Published Apr 9 • 1

commented on a paper 8 months ago

BAPO: Stabilizing Off-Policy Reinforcement Learning for LLMs via Balanced Policy Optimization with Adaptive Clipping

Paper • 2510.18927 • Published Oct 21, 2025 • 85

New activity in agentica-org/DeepScaleR-Preview-Dataset 8 months ago

There are no answers for 6 samples

#4 opened 8 months ago by

New activity in Myashka/CryptoNews_50_50 about 2 years ago

Librarian Bot: Add language metadata for dataset

#2 opened about 2 years ago by