Xuan Yang

TorresYang

https://torresyangx.github.io/

AI & ML interests

LLM reasoning, agents

Recent Activity

authored 2 papers 2 days ago

SafeAuto: Knowledge-Enhanced Safe Autonomous Driving with Multimodal Foundation Models

Paper • 2503.00211 • Published Feb 28, 2025

Beyond Ideal Instruction: A Comprehensive Framework for Evaluating LLMs in Realistic Interactions

Paper • 2606.03318 • Published 12 days ago

updated a collection 9 days ago

RUT-Bench

Benchmark data in "Beyond Ideal Instruction: A Comprehensive Framework for Evaluating LLMs in Realistic Interactions". • 2 items • Updated 9 days ago

updated a collection 10 days ago

RUT-Bench

Benchmark data in "Beyond Ideal Instruction: A Comprehensive Framework for Evaluating LLMs in Realistic Interactions". • 2 items • Updated 10 days ago

New activity in Miaow-Lab/RUT-Bench 10 days ago

Add task categories and link to paper

#1 opened 10 days ago by

updated 2 collections 10 days ago

RUT-Bench

Benchmark data in "Beyond Ideal Instruction: A Comprehensive Framework for Evaluating LLMs in Realistic Interactions". • 2 items • Updated 9 days ago

RUT-Bench

Benchmark data in "Beyond Ideal Instruction: A Comprehensive Framework for Evaluating LLMs in Realistic Interactions". • 2 items • Updated 10 days ago

updated a dataset 10 days ago

Miaow-Lab/RUT-Bench

Viewer • Updated 10 days ago • 1.64k • 70

upvoted a collection 11 days ago

SSAE

Training and evaluation dataset, model checkpoints in 'Step-Level Sparse Autoencoder for Reasoning Process Interpretation' • 3 items • Updated Mar 4 • 3

published a dataset 11 days ago

Miaow-Lab/RUT-Bench

Viewer • Updated 10 days ago • 1.64k • 70

authored a paper 3 months ago

Step-Level Sparse Autoencoder for Reasoning Process Interpretation

Paper • 2603.03031 • Published Mar 3

updated 2 collections 3 months ago

SSAE

Training and evaluation dataset, model checkpoints in 'Step-Level Sparse Autoencoder for Reasoning Process Interpretation' • 3 items • Updated Mar 4 • 3

SSAE

Training and evaluation dataset, model checkpoints in 'Step-Level Sparse Autoencoder for Reasoning Process Interpretation' • 3 items • Updated Mar 4

updated a model 3 months ago

Miaow-Lab/SSAE-Checkpoints

Feature Extraction • Updated Mar 4

updated a dataset 3 months ago

Miaow-Lab/SSAE-Dataset

Viewer • Updated Mar 4 • 1.28M • 49

updated 2 collections 4 months ago

SSAE

Training and evaluation dataset, model checkpoints in 'Step-Level Sparse Autoencoder for Reasoning Process Interpretation' • 3 items • Updated Mar 4 • 3

SSAE

Training and evaluation dataset, model checkpoints in 'Step-Level Sparse Autoencoder for Reasoning Process Interpretation' • 3 items • Updated Mar 4