Abstract
The PAHF framework enables continual personalization of AI agents through explicit per-user memory and dual feedback channels, allowing rapid adaptation to changing user preferences.
Modern AI agents are powerful but often fail to align with the idiosyncratic, evolving preferences of individual users. Prior approaches typically rely on static datasets, either training implicit preference models on interaction history or encoding user profiles in external memory. However, these approaches struggle with new users and with preferences that change over time. We introduce Personalized Agents from Human Feedback (PAHF), a framework for continual personalization in which agents learn online from live interaction using explicit per-user memory. PAHF operationalizes a three-step loop: (1) seeking pre-action clarification to resolve ambiguity, (2) grounding actions in preferences retrieved from memory, and (3) integrating post-action feedback to update memory when preferences drift. To evaluate this capability, we develop a four-phase protocol and two benchmarks in embodied manipulation and online shopping. These benchmarks quantify an agent's ability to learn initial preferences from scratch and subsequently adapt to persona shifts. Our theoretical analysis and empirical results show that integrating explicit memory with dual feedback channels is critical: PAHF learns substantially faster and consistently outperforms both no-memory and single-channel baselines, reducing initial personalization error and enabling rapid adaptation to preference shifts.
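To make the three-step loop concrete, here is a minimal sketch of one PAHF interaction step, assuming a simple slot-based preference memory. All names (`Task`, `UserMemory`, `pahf_step`, the `ask_user`/`get_feedback` callbacks) are illustrative inventions based on the abstract, not the paper's actual API.

```python
"""Hypothetical sketch of the PAHF loop: clarify -> grounded action -> feedback.
Structure and names are assumptions inferred from the abstract."""
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    slots: list[str]  # preference slots the action depends on, e.g. "roast"


@dataclass
class UserMemory:
    """Explicit per-user memory: preference slot -> stored value."""
    prefs: dict[str, str] = field(default_factory=dict)

    def retrieve(self, task: Task) -> dict[str, str]:
        # (2) Ground the action in stored preferences relevant to this task.
        return {s: self.prefs[s] for s in task.slots if s in self.prefs}

    def update(self, slot: str, value: str) -> None:
        # Store or overwrite a preference; overwriting on post-action
        # feedback is what lets memory track preference drift.
        self.prefs[slot] = value


def pahf_step(task: Task, memory: UserMemory, ask_user, get_feedback) -> dict:
    prefs = memory.retrieve(task)
    # (1) Pre-action clarification: ask only about slots memory cannot fill.
    for slot in task.slots:
        if slot not in prefs:
            prefs[slot] = ask_user(slot)
            memory.update(slot, prefs[slot])
    action = {"task": task.name, **prefs}  # stand-in for the agent's action
    # (3) Post-action feedback channel: corrections overwrite stale entries.
    for slot, corrected in get_feedback(action).items():
        memory.update(slot, corrected)
    return action


# Toy run: the user's coffee preference shifts between episodes.
memory = UserMemory()
task = Task("buy_coffee", slots=["roast"])
pahf_step(task, memory, ask_user=lambda s: "dark",
          get_feedback=lambda a: {})                  # cold start: clarify first
pahf_step(task, memory, ask_user=lambda s: "dark",
          get_feedback=lambda a: {"roast": "light"})  # user corrects: drift
assert memory.prefs["roast"] == "light"  # the next episode uses the update
```

The toy run mirrors the two regimes the benchmarks measure: learning an initial preference from scratch via clarification, and adapting after a persona shift when post-action feedback overwrites the stale memory entry.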
Community
AI agents are powerful, but don’t stay aligned with you over time.
When preferences shift, they don’t adapt. You correct them once… they repeat the mistake. 🤦
Introducing PAHF: continual personalization where agents learn from feedback to stay in sync.
This is an automated message from the Librarian Bot. I found the following similar papers, recommended by the Semantic Scholar API:
- Me-Agent: A Personalized Mobile Agent with Two-Level User Habit Learning for Enhanced Interaction (2026)
- Cold-Start Personalization via Training-Free Priors from Structured World Models (2026)
- Synthetic Interaction Data for Scalable Personalization in Large Language Models (2026)
- M2A: Multimodal Memory Agent with Dual-Layer Hybrid Memory for Long-Term Personalized Interactions (2026)
- ShopSimulator: Evaluating and Exploring RL-Driven LLM Agent for Shopping Assistants (2026)
- Learning User Preferences Through Interaction for Long-Term Collaboration (2026)
- AlignUSER: Human-Aligned LLM Agents via World Models for Recommender System Evaluation (2026)