Articles
ChipNeMo: Domain-Adapted LLMs for Chip Design
Paper
• 2311.00176
• Published
• 9
Language Models can be Logical Solvers
Paper
• 2311.06158
• Published
• 20
JARVIS-1: Open-World Multi-task Agents with Memory-Augmented Multimodal Language Models
Paper
• 2311.05997
• Published
• 37
Lumos: Learning Agents with Unified Data, Modular Design, and Open-Source LLMs
Paper
• 2311.05657
• Published
• 30
JaxMARL: Multi-Agent RL Environments in JAX
Paper
• 2311.10090
• Published
• 8
ML-Bench: Large Language Models Leverage Open-source Libraries for Machine Learning Tasks
Paper
• 2311.09835
• Published
• 11
Large Language Models for Mathematicians
Paper
• 2312.04556
• Published
• 12
G-LLaVA: Solving Geometric Problem with Multi-Modal Large Language Model
Paper
• 2312.11370
• Published
• 20
Boundary Attention: Learning to Find Faint Boundaries at Any Resolution
Paper
• 2401.00935
• Published
• 18
Teaching Large Language Models to Reason with Reinforcement Learning
Paper
• 2403.04642
• Published
• 49
LLM Agent Operating System
Paper
• 2403.16971
• Published
• 73
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking
Paper
• 2501.04519
• Published
• 288
Evolving Deeper LLM Thinking
Paper
• 2501.09891
• Published
• 115
AgentRxiv: Towards Collaborative Autonomous Research
Paper
• 2503.18102
• Published
• 25
TTRL: Test-Time Reinforcement Learning
Paper
• 2504.16084
• Published
• 120
Learning Adaptive Parallel Reasoning with Language Models
Paper
• 2504.15466
• Published
• 44
LLMs are Greedy Agents: Effects of RL Fine-tuning on Decision-Making Abilities
Paper
• 2504.16078
• Published
• 21
FlowReasoner: Reinforcing Query-Level Meta-Agents
Paper
• 2504.15257
• Published
• 47
Paper2Code: Automating Code Generation from Scientific Papers in Machine Learning
Paper
• 2504.17192
• Published
• 123
AIMO-2 Winning Solution: Building State-of-the-Art Mathematical Reasoning Models with OpenMathReasoning dataset
Paper
• 2504.16891
• Published
• 25
Skywork R1V2: Multimodal Hybrid Reinforcement Learning for Reasoning
Paper
• 2504.16656
• Published
• 57
Flow-GRPO: Training Flow Matching Models via Online RL
Paper
• 2505.05470
• Published
• 88
Measuring General Intelligence with Generated Games
Paper
• 2505.07215
• Published
• 11
Enigmata: Scaling Logical Reasoning in Large Language Models with Synthetic Verifiable Puzzles
Paper
• 2505.19914
• Published
• 46
LiveCodeBench Pro: How Do Olympiad Medalists Judge LLMs in Competitive Programming?
Paper
• 2506.11928
• Published
• 24
Xolver: Multi-Agent Reasoning with Holistic Experience Learning Just Like an Olympiad Team
Paper
• 2506.14234
• Published
• 41
ALE-Bench: A Benchmark for Long-Horizon Objective-Driven Algorithm Engineering
Paper
• 2506.09050
• Published
• 6
ProtoReasoning: Prototypes as the Foundation for Generalizable Reasoning in LLMs
Paper
• 2506.15211
• Published
• 39
SwarmAgentic: Towards Fully Automated Agentic System Generation via Swarm Intelligence
Paper
• 2506.15672
• Published
• 15
NovelSeek: When Agent Becomes the Scientist -- Building Closed-Loop System from Hypothesis to Verification
Paper
• 2505.16938
• Published
• 121
DiffuCoder: Understanding and Improving Masked Diffusion Models for Code Generation
Paper
• 2506.20639
• Published
• 31
Inverse Reinforcement Learning Meets Large Language Model Post-Training: Basics, Advances, and Opportunities
Paper
• 2507.13158
• Published
• 24
CUDA-L1: Improving CUDA Optimization via Contrastive Reinforcement Learning
Paper
• 2507.14111
• Published
• 25
Speed Always Wins: A Survey on Efficient Architectures for Large Language Models
Paper
• 2508.09834
• Published
• 53
Deep Think with Confidence
Paper
• 2508.15260
• Published
• 90
Parallel-R1: Towards Parallel Thinking via Reinforcement Learning
Paper
• 2509.07980
• Published
• 104
Revolutionizing Reinforcement Learning Framework for Diffusion Large Language Models
Paper
• 2509.06949
• Published
• 56
Paper2Agent: Reimagining Research Papers As Interactive and Reliable AI Agents
Paper
• 2509.06917
• Published
• 43
The Majority is not always right: RL training for solution aggregation
Paper
• 2509.06870
• Published
• 15
SWE-Bench Pro: Can AI Agents Solve Long-Horizon Software Engineering Tasks?
Paper
• 2509.16941
• Published
• 21
UltraHorizon: Benchmarking Agent Capabilities in Ultra Long-Horizon Scenarios
Paper
• 2509.21766
• Published
• 24
Less is More: Recursive Reasoning with Tiny Networks
Paper
• 2510.04871
• Published
• 509
MLE-Smith: Scaling MLE Tasks with Automated Multi-Agent Pipeline
Paper
• 2510.07307
• Published
• 6
Parallel Test-Time Scaling for Latent Reasoning Models
Paper
• 2510.07745
• Published
• 7
LaSeR: Reinforcement Learning with Last-Token Self-Rewarding
Paper
• 2510.14943
• Published
• 40
Multiplex Thinking: Reasoning via Token-wise Branch-and-Merge
Paper
• 2601.08808
• Published
• 39
Accelerating Scientific Research with Gemini: Case Studies and Common Techniques
Paper
• 2602.03837
• Published
• 5