jian

lipliu

cquliujian

AI & ML interests

None yet

Recent Activity

upvoted a paper 4 days ago

Towards Automating Scientific Review with Google's Paper Assistant Tool

upvoted a paper 11 days ago

GLM-5: from Vibe Coding to Agentic Engineering

upvoted a paper 21 days ago

Self-Distilled Agentic Reinforcement Learning

Organizations

None yet

upvoted a paper 4 days ago

Towards Automating Scientific Review with Google's Paper Assistant Tool

Paper • 2606.28277 • Published 7 days ago • 5

upvoted a paper 11 days ago

GLM-5: from Vibe Coding to Agentic Engineering

Paper • 2602.15763 • Published Feb 17 • 195

upvoted 3 papers 21 days ago

Self-Distilled Agentic Reinforcement Learning

Paper • 2605.15155 • Published May 14 • 116

Flow-OPD: On-Policy Distillation for Flow Matching Models

Paper • 2605.08063 • Published May 8 • 102

RubricEM: Meta-RL with Rubric-guided Policy Decomposition beyond Verifiable Rewards

Paper • 2605.10899 • Published May 11 • 79

upvoted a paper 25 days ago

On the Scaling of PEFT: Towards Million Personal Models of Trillion Parameters

Paper • 2606.02437 • Published Jun 1 • 236

upvoted a paper 28 days ago

Lens: Rethinking Training Efficiency for Foundational Text-to-Image Models

Paper • 2605.21573 • Published May 20 • 111

upvoted 11 papers 29 days ago

Code as Agent Harness

Paper • 2605.18747 • Published May 18 • 223

MUSE-Autoskill: Self-Evolving Agents via Skill Creation, Memory, Management, and Evaluation

Paper • 2605.27366 • Published May 26 • 29

Rethinking Memory as Continuously Evolving Connectivity

Paper • 2605.28773 • Published May 27 • 34

Agent Explorative Policy Optimization for Multimodal Agentic Reasoning

Paper • 2605.28774 • Published May 27 • 93

SkillOpt: Executive Strategy for Self-Evolving Agent Skills

Paper • 2605.23904 • Published May 22 • 249

ESPO: Early-Stopping Proximal Policy Optimization

Paper • 2605.29860 • Published May 28 • 20

Language Models Need Sleep: Learning to Self-Modify and Consolidate Memories

Paper • 2606.03979 • Published Jun 2 • 29

SkillAdaptor: Self-Adapting Skills for LLM Agents from Trajectories

Paper • 2606.01311 • Published May 31 • 37

Trust Region On-Policy Distillation

Paper • 2606.01249 • Published May 31 • 46

Trust-Region Behavior Blending for On-Policy Distillation

Paper • 2605.31159 • Published May 29 • 68

Self-Distilled Policy Gradient

Paper • 2606.04036 • Published Jun 2 • 27

upvoted a paper 3 months ago

TAPS: Task Aware Proposal Distributions for Speculative Sampling

Paper • 2603.27027 • Published Mar 27 • 145

upvoted a paper 4 months ago

Reasoning Models Struggle to Control their Chains of Thought

Paper • 2603.05706 • Published Mar 5 • 39