
Xiangyu

xixy

https://xixy.github.io/

AI & ML interests

None yet

Recent Activity

commented on a paper 4 days ago

SkillEvolBench: Benchmarking the Evolution from Episodic Experience to Procedural Skills

upvoted a paper 4 days ago

SkillEvolBench: Benchmarking the Evolution from Episodic Experience to Procedural Skills

upvoted a paper 18 days ago

TMAS: Scaling Test-Time Compute via Multi-Agent Synergy


Organizations

None yet

commented on a paper 4 days ago

SkillEvolBench: Benchmarking the Evolution from Episodic Experience to Procedural Skills

Paper • 2605.24117 • Published 8 days ago • 17

upvoted a paper 4 days ago

SkillEvolBench: Benchmarking the Evolution from Episodic Experience to Procedural Skills

Paper • 2605.24117 • Published 8 days ago • 17

upvoted a paper 18 days ago

TMAS: Scaling Test-Time Compute via Multi-Agent Synergy

Paper • 2605.10344 • Published 19 days ago • 49

upvoted 2 papers 24 days ago

Workspace-Bench 1.0: Benchmarking AI Agents on Workspace Tasks with Large-Scale File Dependencies

Paper • 2605.03596 • Published 25 days ago • 10

HeavySkill: Heavy Thinking as the Inner Skill in Agentic Harness

Paper • 2605.02396 • Published 26 days ago • 24

upvoted a paper 30 days ago

ClawGym: A Scalable Framework for Building Effective Claw Agents

Paper • 2604.26904 • Published about 1 month ago • 50

authored a paper about 2 months ago

On the Role of Reasoning Patterns in the Generalization Discrepancy of Long Chain-of-Thought Supervised Fine-Tuning

Paper • 2604.01702 • Published Apr 4 • 3

upvoted 2 papers about 2 months ago

How Well Do Agentic Skills Work in the Wild: Benchmarking LLM Skill Usage in Realistic Settings

Paper • 2604.04323 • Published Apr 6 • 41

On the Role of Reasoning Patterns in the Generalization Discrepancy of Long Chain-of-Thought Supervised Fine-Tuning

Paper • 2604.01702 • Published Apr 4 • 3

commented on a paper about 2 months ago

Embarrassingly Simple Self-Distillation Improves Code Generation

Paper • 2604.01193 • Published Apr 1 • 54

authored a paper 2 months ago

LongCat-Flash-Prover: Advancing Native Formal Reasoning via Agentic Tool-Integrated Reinforcement Learning

Paper • 2603.21065 • Published Mar 22 • 78

New activity in Jackrong/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled 3 months ago

Claude distillation


#1 opened 3 months ago by gergopool

upvoted a paper 3 months ago

How Controllable Are Large Language Models? A Unified Evaluation across Behavioral Granularities

Paper • 2603.02578 • Published Mar 3 • 25

authored 6 papers 4 months ago

upvoted a paper 4 months ago

LongCat-Flash-Thinking-2601 Technical Report

Paper • 2601.16725 • Published Jan 23 • 181
