Papers
arxiv:2604.11811

M^star: Every Task Deserves Its Own Memory Harness

Published on Apr 10
Abstract

M^star automatically discovers task-optimized memory systems for large language model agents through program evolution, demonstrating superior performance across conversation, embodied planning, and expert reasoning tasks.

AI-generated summary

Large language model agents rely on specialized memory systems to accumulate and reuse knowledge during extended interactions. Recent architectures typically adopt a fixed memory design tailored to a specific domain, such as semantic retrieval for conversations or skill reuse for coding. However, a memory system optimized for one purpose frequently fails to transfer to others. To address this limitation, we introduce M^star, a method that automatically discovers task-optimized memory harnesses through executable program evolution. Specifically, M^star models an agent memory system as a memory program written in Python. This program encapsulates the data Schema, the storage Logic, and the agent workflow Instructions. We optimize these components jointly using a reflective code evolution method; this approach employs a population-based search strategy and analyzes evaluation failures to iteratively refine the candidate programs. We evaluate M^star on four distinct benchmarks spanning conversation, embodied planning, and expert reasoning. Our results demonstrate that M^star robustly improves performance over existing fixed-memory baselines across all evaluated tasks. Furthermore, the evolved memory programs exhibit structurally distinct processing mechanisms for each domain. This finding indicates that specializing the memory mechanism to a given task, by searching a broad design space, yields superior solutions compared to general-purpose memory paradigms.
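To make the abstract's three components concrete, here is a minimal sketch of what a Python "memory program" with a data Schema, storage Logic, and workflow Instructions might look like, together with a toy population-based refinement loop. All names (`MemoryEntry`, `MemoryProgram`, `store`, `retrieve`, `evolve`) and the word-overlap relevance score are illustrative assumptions, not the paper's actual API or evolution procedure.

```python
from dataclasses import dataclass, field

@dataclass
class MemoryEntry:
    # data Schema (assumed): what each memory record holds
    key: str
    content: str

@dataclass
class MemoryProgram:
    # storage Logic (assumed): a flat list plus rules for adding and ranking
    entries: list = field(default_factory=list)
    # workflow Instructions (assumed): text spliced into the agent's prompt
    instructions: str = "Consult retrieved memories before acting."

    def store(self, key: str, content: str) -> None:
        self.entries.append(MemoryEntry(key, content))

    def retrieve(self, query: str, k: int = 2) -> list:
        # toy relevance: count of words shared between query and content
        def overlap(e: MemoryEntry) -> int:
            return len(set(query.split()) & set(e.content.split()))
        return sorted(self.entries, key=overlap, reverse=True)[:k]

def evolve(population, evaluate, mutate, generations=3):
    # sketch of a population-based search: keep the best-scoring half of
    # the candidate programs, then refine survivors via mutation; in the
    # paper the refinement is reflective (driven by evaluation failures)
    for _ in range(generations):
        scored = sorted(population, key=evaluate, reverse=True)
        survivors = scored[: max(1, len(scored) // 2)]
        population = survivors + [mutate(p) for p in survivors]
    return max(population, key=evaluate)
```

For example, a program that has stored a note about kitchen cleaning will rank it above unrelated notes when queried about that task, and `evolve` will iteratively improve any candidate under any scalar `evaluate` function.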

