2026 - a LilRain17 Collection

2026

updated 1 day ago

Youtu-LLM: Unlocking the Native Agentic Potential for Lightweight Large Language Models

Paper • 2512.24618 • Published 24 days ago • 139
Let It Flow: Agentic Crafting on Rock and Roll, Building the ROME Model within an Open Agentic Learning Ecosystem

Paper • 2512.24873 • Published 24 days ago • 102
AI Meets Brain: Memory Systems from Cognitive Neuroscience to Autonomous Agents

Paper • 2512.23343 • Published 26 days ago • 28
Scaling Open-Ended Reasoning to Predict the Future

Paper • 2512.25070 • Published 24 days ago • 16
Fantastic Reasoning Behaviors and Where to Find Them: Unsupervised Discovery of the Reasoning Process

Paper • 2512.23988 • Published 25 days ago • 16
Figure It Out: Improving the Frontier of Reasoning with Active Visual Thinking

Paper • 2512.24297 • Published 25 days ago • 6
Valori: A Deterministic Memory Substrate for AI Systems

Paper • 2512.22280 • Published about 1 month ago • 4
Improving Multi-step RAG with Hypergraph-based Memory for Long-Context Complex Relational Modeling

Paper • 2512.23959 • Published 26 days ago • 108
Dynamic Large Concept Models: Latent Reasoning in an Adaptive Semantic Space

Paper • 2512.24617 • Published 24 days ago • 60
Youtu-Agent: Scaling Agent Productivity with Automated Generation and Hybrid Policy Optimization

Paper • 2512.24615 • Published 24 days ago • 117
Nested Learning: The Illusion of Deep Learning Architectures

Paper • 2512.24695 • Published 24 days ago • 40
SenseNova-MARS: Empowering Multimodal Agentic Reasoning and Search via Reinforcement Learning

Paper • 2512.24330 • Published 25 days ago • 35
The Reasoning-Creativity Trade-off: Toward Creativity-Driven Problem Solving

Paper • 2601.00747 • Published 22 days ago • 19
Diversity or Precision? A Deep Dive into Next Token Prediction

Paper • 2512.22955 • Published 27 days ago • 8
Fast-weight Product Key Memory

Paper • 2601.00671 • Published 22 days ago • 5
Can LLMs Predict Their Own Failures? Self-Awareness via Internal Circuits

Paper • 2512.20578 • Published Dec 23, 2025 • 82
Can We Trust AI Explanations? Evidence of Systematic Underreporting in Chain-of-Thought Reasoning

Paper • 2601.00830 • Published about 1 month ago • 3
SimpleMem: Efficient Lifelong Memory for LLM Agents

Paper • 2601.02553 • Published 19 days ago • 37
Falcon-H1R: Pushing the Reasoning Frontiers with a Hybrid Model for Efficient Test-Time Scaling

Paper • 2601.02346 • Published 19 days ago • 26
OpenNovelty: An LLM-powered Agentic System for Verifiable Scholarly Novelty Assessment

Paper • 2601.01576 • Published 20 days ago • 18
Confidence Estimation for LLMs in Multi-turn Interactions

Paper • 2601.02179 • Published 19 days ago • 15
CPPO: Contrastive Perception for Vision Language Policy Optimization

Paper • 2601.00501 • Published 23 days ago • 7
Project Ariadne: A Structural Causal Framework for Auditing Faithfulness in LLM Agents

Paper • 2601.02314 • Published 19 days ago • 2
UniCorn: Towards Self-Improving Unified Multimodal Models through Self-Generated Supervision

Paper • 2601.03193 • Published 18 days ago • 46
NitroGen: An Open Foundation Model for Generalist Gaming Agents

Paper • 2601.02427 • Published 20 days ago • 42
MindWatcher: Toward Smarter Multimodal Tool-Integrated Reasoning

Paper • 2512.23412 • Published 26 days ago • 39
CogFlow: Bridging Perception and Reasoning through Knowledge Internalization for Visual Mathematical Problem Solving

Paper • 2601.01874 • Published 19 days ago • 19
Digital Twin AI: Opportunities and Challenges from Large Language Models to World Models

Paper • 2601.01321 • Published 21 days ago • 18
WebGym: Scaling Training Environments for Visual Web Agents with Realistic Tasks

Paper • 2601.02439 • Published 19 days ago • 16
Large Reasoning Models Are (Not Yet) Multilingual Latent Reasoners

Paper • 2601.02996 • Published 18 days ago • 5
Steerability of Instrumental-Convergence Tendencies in LLMs

Paper • 2601.01584 • Published 20 days ago • 1
Entropy-Adaptive Fine-Tuning: Resolving Confident Conflicts to Mitigate Forgetting

Paper • 2601.02151 • Published 19 days ago • 102
Evolving Programmatic Skill Networks

Paper • 2601.03509 • Published 18 days ago • 80
Atlas: Orchestrating Heterogeneous Models and Tools for Multi-Domain Complex Reasoning

Paper • 2601.03872 • Published 17 days ago • 41
Agentic Rubrics as Contextual Verifiers for SWE Agents

Paper • 2601.04171 • Published 17 days ago • 11
MDAgent2: Large Language Model for Code Generation and Knowledge Q&A in Molecular Dynamics

Paper • 2601.02075 • Published 19 days ago • 8
Why LLMs Aren't Scientists Yet: Lessons from Four Autonomous Research Attempts

Paper • 2601.03315 • Published 18 days ago • 6
MAGMA: A Multi-Graph based Agentic Memory Architecture for AI Agents

Paper • 2601.03236 • Published 18 days ago • 3
GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization

Paper • 2601.05242 • Published 16 days ago • 204
RelayLLM: Efficient Reasoning via Collaborative Decoding

Paper • 2601.05167 • Published 16 days ago • 29
AT^2PO: Agentic Turn-based Policy Optimization via Tree Search

Paper • 2601.04767 • Published 16 days ago • 27
Scaling Behavior Cloning Improves Causal Reasoning: An Open Model for Real-Time Video Game Playing

Paper • 2601.04575 • Published 16 days ago • 8
Agent-as-a-Judge

Paper • 2601.05111 • Published 16 days ago • 18
The Illusion of Specialization: Unveiling the Domain-Invariant "Standing Committee" in Mixture-of-Experts Models

Paper • 2601.03425 • Published 18 days ago • 16
DocDancer: Towards Agentic Document-Grounded Information Seeking

Paper • 2601.05163 • Published 16 days ago • 5
One Sample to Rule Them All: Extreme Data Efficiency in RL Scaling

Paper • 2601.03111 • Published 18 days ago • 9
AgentDevel: Reframing Self-Evolving LLM Agents as Release Engineering

Paper • 2601.04620 • Published 16 days ago • 3
Learning User Preferences Through Interaction for Long-Term Collaboration

Paper • 2601.02702 • Published 18 days ago • 2
Thinking with Map: Reinforced Parallel Map-Augmented Agent for Geolocalization

Paper • 2601.05432 • Published 16 days ago • 160
The Molecular Structure of Thought: Mapping the Topology of Long Chain-of-Thought Reasoning

Paper • 2601.06002 • Published 15 days ago • 49
Chaining the Evidence: Robust Reinforcement Learning for Deep Search Agents with Citation-Aware Rubric Rewards

Paper • 2601.06021 • Published 15 days ago • 43
EnvScaler: Scaling Tool-Interactive Environments for LLM Agent via Programmatic Synthesis

Paper • 2601.05808 • Published 15 days ago • 36
AgentOCR: Reimagining Agent History via Optical Self-Compression

Paper • 2601.04786 • Published 16 days ago • 28
Can We Predict Before Executing Machine Learning Agents?

Paper • 2601.05930 • Published 15 days ago • 26
An Empirical Study on Preference Tuning Generalization and Diversity Under Domain Shift

Paper • 2601.05882 • Published 15 days ago • 20
Illusions of Confidence? Diagnosing LLM Truthfulness via Neighborhood Consistency

Paper • 2601.05905 • Published 15 days ago • 18
SmartSearch: Process Reward-Guided Query Refinement for Search Agents

Paper • 2601.04888 • Published 16 days ago • 9
Over-Searching in Search-Augmented Large Language Models

Paper • 2601.05503 • Published 16 days ago • 6
DR-LoRA: Dynamic Rank LoRA for Mixture-of-Experts Adaptation

Paper • 2601.04823 • Published 16 days ago • 6
Memory Matters More: Event-Centric Memory as a Logic Map for Agent Searching and Reasoning

Paper • 2601.04726 • Published 16 days ago • 6
TCAndon-Router: Adaptive Reasoning Router for Multi-Agent Collaboration

Paper • 2601.04544 • Published 17 days ago • 6
IIB-LPO: Latent Policy Optimization via Iterative Information Bottleneck

Paper • 2601.05870 • Published 15 days ago • 3
Distilling Feedback into Memory-as-a-Tool

Paper • 2601.05960 • Published 15 days ago • 2
BabyVision: Visual Reasoning Beyond Language

Paper • 2601.06521 • Published 14 days ago • 188
PaCoRe: Learning to Scale Test-Time Compute with Parallel Coordinated Reasoning

Paper • 2601.05593 • Published 15 days ago • 79
Lost in the Noise: How Reasoning Models Fail with Contextual Distractors

Paper • 2601.07226 • Published 12 days ago • 30
Dr. Zero: Self-Evolving Search Agents without Training Data

Paper • 2601.07055 • Published 13 days ago • 20
OS-Symphony: A Holistic Framework for Robust and Generalist Computer-Using Agent

Paper • 2601.07779 • Published 12 days ago • 26
Controllable Memory Usage: Balancing Anchoring and Innovation in Long-Term Human-Agent Interaction

Paper • 2601.05107 • Published 16 days ago • 22
ET-Agent: Incentivizing Effective Tool-Integrated Reasoning Agent via Behavior Calibration

Paper • 2601.06860 • Published 13 days ago • 16
MegaFlow: Large-Scale Distributed Orchestration System for the Agentic Era

Paper • 2601.07526 • Published 12 days ago • 21
Forest Before Trees: Latent Superposition for Efficient Visual Reasoning

Paper • 2601.06803 • Published 13 days ago • 10
TourPlanner: A Competitive Consensus Framework with Constraint-Gated Reinforcement Learning for Travel Planning

Paper • 2601.04698 • Published 16 days ago • 10
How Do Large Language Models Learn Concepts During Continual Pre-Training?

Paper • 2601.03570 • Published 17 days ago • 4
OpenTinker: Separating Concerns in Agentic Reinforcement Learning

Paper • 2601.07376 • Published 12 days ago • 6
ShowUI-Aloha: Human-Taught GUI Agent

Paper • 2601.07181 • Published 12 days ago • 3
Are LLM Decisions Faithful to Verbal Confidence?

Paper • 2601.07767 • Published 12 days ago • 4
Structured Episodic Event Memory

Paper • 2601.06411 • Published 15 days ago • 4
Artificial Entanglement in the Fine-Tuning of Large Language Models

Paper • 2601.06788 • Published 13 days ago • 3
User-Oriented Multi-Turn Dialogue Generation with Tool Use at scale

Paper • 2601.08225 • Published 11 days ago • 50
ArenaRL: Scaling RL for Open-Ended Agents via Tournament-based Relative Ranking

Paper • 2601.06487 • Published 14 days ago • 49
On the Non-decoupling of Supervised Fine-tuning and Reinforcement Learning in Post-training

Paper • 2601.07389 • Published 12 days ago • 2
MemoBrain: Executive Memory as an Agentic Brain for Reasoning

Paper • 2601.08079 • Published 12 days ago • 37
MemGovern: Enhancing Code Agents through Learning from Governed Human Experiences

Paper • 2601.06789 • Published 13 days ago • 75
The Confidence Dichotomy: Analyzing and Mitigating Miscalibration in Tool-Use Agents

Paper • 2601.07264 • Published 12 days ago • 24
Parallel Context-of-Experts Decoding for Retrieval Augmented Generation

Paper • 2601.08670 • Published 11 days ago • 19
Aligning Text, Code, and Vision: A Multi-Objective Reinforcement Learning Framework for Text-to-Visualization

Paper • 2601.04582 • Published 16 days ago • 10
JudgeRLVR: Judge First, Generate Second for Efficient Reasoning

Paper • 2601.08468 • Published 11 days ago • 6
EpiCaR: Knowing What You Don't Know Matters for Better Reasoning in LLMs

Paper • 2601.06786 • Published 13 days ago • 6
Controlled Self-Evolution for Algorithmic Code Optimization

Paper • 2601.07348 • Published 12 days ago • 110
MAXS: Meta-Adaptive Exploration with LLM Agents

Paper • 2601.09259 • Published 10 days ago • 93
EvoFSM: Controllable Self-Evolution for Deep Research with Finite State Machines

Paper • 2601.09465 • Published 10 days ago • 40
OpenDecoder: Open Large Language Model Decoding to Incorporate Document Quality in RAG

Paper • 2601.09028 • Published 11 days ago • 33
ExpSeek: Self-Triggered Experience Seeking for Web Agents

Paper • 2601.08605 • Published 11 days ago • 16
Imagine-then-Plan: Agent Learning from Adaptive Lookahead with World Models

Paper • 2601.08955 • Published 11 days ago • 13
No More Stale Feedback: Co-Evolving Critics for Open-World Agent Learning

Paper • 2601.06794 • Published 13 days ago • 4
The AI Hippocampus: How Far are We From Human Memory?

Paper • 2601.09113 • Published 11 days ago • 5
DPWriter: Reinforcement Learning with Diverse Planning Branching for Creative Writing

Paper • 2601.09609 • Published 10 days ago • 3
Omni-R1: Towards the Unified Generative Paradigm for Multimodal Reasoning

Paper • 2601.09536 • Published 10 days ago • 3
SCALER: Synthetic Scalable Adaptive Learning Environment for Reasoning

Paper • 2601.04809 • Published 16 days ago • 3
Rewarding the Rare: Uniqueness-Aware RL for Creative Problem Solving in LLMs

Paper • 2601.08763 • Published 11 days ago • 140
Collaborative Multi-Agent Test-Time Reinforcement Learning for Reasoning

Paper • 2601.09667 • Published 10 days ago • 82
Beyond Static Tools: Test-Time Tool Evolution for Scientific Reasoning

Paper • 2601.07641 • Published 12 days ago • 45
Toward Ultra-Long-Horizon Agentic Science: Cognitive Accumulation for Machine Learning Engineering

Paper • 2601.10402 • Published 9 days ago • 36
MatchTIR: Fine-Grained Supervision for Tool-Integrated Reasoning via Bipartite Matching

Paper • 2601.10712 • Published 9 days ago • 24
LaViT: Aligning Latent Visual Thoughts for Multi-modal Reasoning

Paper • 2601.10129 • Published 9 days ago • 11
PACEvolve: Enabling Long-Horizon Progress-Aware Consistent Evolution

Paper • 2601.10657 • Published 9 days ago • 19
LSRIF: Logic-Structured Reinforcement Learning for Instruction Following

Paper • 2601.06431 • Published 14 days ago • 12
PRL: Process Reward Learning Improves LLMs' Reasoning Ability and Broadens the Reasoning Boundary

Paper • 2601.10201 • Published 9 days ago • 8
Agent Skills in the Wild: An Empirical Study of Security Vulnerabilities at Scale

Paper • 2601.10338 • Published 9 days ago • 5
Memory Bank Compression for Continual Adaptation of Large Language Models

Paper • 2601.00756 • Published 22 days ago • 2
Distribution-Aligned Sequence Distillation for Superior Long-CoT Reasoning

Paper • 2601.09088 • Published 11 days ago • 57
Your Group-Relative Advantage Is Biased

Paper • 2601.08521 • Published 11 days ago • 140
The Poisoned Apple Effect: Strategic Manipulation of Mediated Markets via Technology Expansion of AI Agents

Paper • 2601.11496 • Published 8 days ago • 45
Unlocking Implicit Experience: Synthesizing Tool-Use Trajectories from Text

Paper • 2601.10355 • Published 9 days ago • 38
BAPO: Boundary-Aware Policy Optimization for Reliable Agentic Search

Paper • 2601.11037 • Published 8 days ago • 17
ProFit: Leveraging High-Value Signals in SFT via Probability-Guided Token Selection

Paper • 2601.09195 • Published 10 days ago • 15
Reasoning Models Generate Societies of Thought

Paper • 2601.10825 • Published 9 days ago • 11
PersonalAlign: Hierarchical Implicit Intent Alignment for Personalized GUI Agent with Long-Term User-Centric Records

Paper • 2601.09636 • Published 10 days ago • 8
Language of Thought Shapes Output Diversity in Large Language Models

Paper • 2601.11227 • Published 8 days ago • 7
Multiplex Thinking: Reasoning via Token-wise Branch-and-Merge

Paper • 2601.08808 • Published 11 days ago • 37
NAACL: Noise-AwAre Verbal Confidence Calibration for LLMs in RAG Systems

Paper • 2601.11004 • Published 8 days ago • 29
Spurious Rewards Paradox: Mechanistically Understanding How RLVR Activates Memorization Shortcuts in LLMs

Paper • 2601.11061 • Published 8 days ago • 7
YaPO: Learnable Sparse Activation Steering Vectors for Domain Adaptation

Paper • 2601.08441 • Published 11 days ago • 7
CLARE: Continual Learning for Vision-Language-Action Models via Autonomous Adapter Routing and Expansion

Paper • 2601.09512 • Published 10 days ago • 4
Think3D: Thinking with Space for Spatial Reasoning

Paper • 2601.13029 • Published 5 days ago • 44
Toward Efficient Agents: Memory, Tool learning, and Planning

Paper • 2601.14192 • Published 4 days ago • 47
FutureOmni: Evaluating Future Forecasting from Omni-Modal Context for Multimodal LLMs

Paper • 2601.13836 • Published 4 days ago • 34
DARC: Decoupled Asymmetric Reasoning Curriculum for LLM Evolution

Paper • 2601.13761 • Published 4 days ago • 15
ToolPRMBench: Evaluating and Advancing Process Reward Models for Tool-using Agents

Paper • 2601.12294 • Published 6 days ago • 15
Aligning Agentic World Models via Knowledgeable Experience Learning

Paper • 2601.13247 • Published 5 days ago • 15
Agentic-R: Learning to Retrieve for Agentic Search

Paper • 2601.11888 • Published 8 days ago • 18
Which Reasoning Trajectories Teach Students to Reason Better? A Simple Metric of Informative Alignment

Paper • 2601.14249 • Published 4 days ago • 6
InT: Self-Proposed Interventions Enable Credit Assignment in LLM Reasoning

Paper • 2601.14209 • Published 4 days ago • 5
Uncertainty-Aware Gradient Signal-to-Noise Data Selection for Instruction Tuning

Paper • 2601.13697 • Published 4 days ago • 3
Agentic Reasoning for Large Language Models

Paper • 2601.12538 • Published 6 days ago • 163
Paper2Rebuttal: A Multi-Agent Framework for Transparent Author Response Assistance

Paper • 2601.14171 • Published 4 days ago • 44
Behavior Knowledge Merge in Reinforced Agentic Models

Paper • 2601.13572 • Published 4 days ago • 21
Render-of-Thought: Rendering Textual Chain-of-Thought as Images for Visual Latent Reasoning

Paper • 2601.14750 • Published 3 days ago • 14
Numina-Lean-Agent: An Open and General Agentic Reasoning System for Formal Mathematics

Paper • 2601.14027 • Published 4 days ago • 10
Lost in the Prompt Order: Revealing the Limitations of Causal Attention in Language Models

Paper • 2601.14152 • Published 4 days ago • 4
The Responsibility Vacuum: Organizational Failure in Scaled Agent Systems

Paper • 2601.15059 • Published 3 days ago • 3
Facilitating Proactive and Reactive Guidance for Decision Making on the Web: A Design Probe with WebSeek

Paper • 2601.15100 • Published 3 days ago • 3
