Published: 2025-11-08

Updated: 2025-11-27

Word count: 18.3k

Reading time: 74 min

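⚠️ All summaries below were generated by a large language model and may contain errors. They are for reference only; use them with caution.
🔴 Note: do not use this content in serious academic settings. It is intended only for initial screening before reading a paper.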

Updated 2025-11-08

Regret Lower Bounds for Decentralized Multi-Agent Stochastic Shortest Path Problems

Authors: Utkarsh U. Chavan, Prashant Trivedi, Nandyala Hemachandra

Multi-agent systems (MAS) are central to applications such as swarm robotics and traffic routing, where agents must coordinate in a decentralized manner to achieve a common objective. Stochastic Shortest Path (SSP) problems provide a natural framework for modeling decentralized control in such settings. While the problem of learning in SSP has been extensively studied in single-agent settings, the decentralized multi-agent variant remains largely unexplored. In this work, we take a step towards addressing that gap. We study decentralized multi-agent SSPs (Dec-MASSPs) under linear function approximation, where the transition dynamics and costs are represented using linear models. Applying novel symmetry-based arguments, we identify the structure of optimal policies. Our main contribution is the first regret lower bound for this setting based on the construction of hard-to-learn instances for any number of agents, $n$. Our regret lower bound of $\Omega(\sqrt{K})$, over $K$ episodes, highlights the inherent learning difficulty in Dec-MASSPs. These insights clarify the learning complexity of decentralized control and can further guide the design of efficient learning algorithms in multi-agent systems.

Paper and project links

PDF To appear in 39th Conference on Neural Information Processing Systems (NeurIPS 2025)

Summary
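Multi-agent systems (MAS) are central to applications such as swarm robotics and traffic routing, where agents must act under decentralized control to reach a common objective. Stochastic Shortest Path (SSP) problems provide a natural framework for decentralized control in such settings. Although learning in SSPs has been studied extensively for a single agent, the decentralized multi-agent variant remains largely unexplored. This work narrows that gap by studying decentralized multi-agent SSPs (Dec-MASSPs) under linear function approximation, where the transition dynamics and costs are represented by linear models. Using novel symmetry-based arguments, the authors identify the structure of optimal policies. The main contribution is the first regret lower bound for this setting, obtained by constructing hard-to-learn instances for any number of agents $n$. The lower bound of $\Omega(\sqrt{K})$ over $K$ episodes highlights the inherent learning difficulty of Dec-MASSPs. These insights clarify the learning complexity of decentralized control and can guide the design of efficient learning algorithms for multi-agent systems.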

Key Takeaways

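Multi-agent systems (MAS) play a key role in coordinating multiple agents toward a common task.
Stochastic Shortest Path (SSP) problems provide a natural framework for modeling decentralized control tasks.
Cooperative learning among agents in decentralized multi-agent SSPs (Dec-MASSPs) remains largely unexplored.
The authors study the structure of optimal policies in Dec-MASSPs under linear function approximation.
Symmetry-based arguments are used to identify the structure of the optimal policies.
The paper establishes the first regret lower bound for this setting, $\Omega(\sqrt{K})$ over $K$ episodes, which reflects the inherent difficulty of learning.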
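
For context on what the bound says, the following is a sketch of how regret is commonly defined in the single-agent SSP literature. The symbols $c$, $s_{k,h}$, $a_{k,h}$, $I_k$, $V^\star$, and $s_{\mathrm{init}}$ are assumptions introduced here for illustration, and the paper's own definition for the decentralized setting may differ in detail.

$$
R_K \;=\; \sum_{k=1}^{K} \sum_{h=1}^{I_k} c(s_{k,h}, a_{k,h}) \;-\; K \, V^\star(s_{\mathrm{init}})
$$

Here $I_k$ is the (random) length of episode $k$ and $V^\star(s_{\mathrm{init}})$ is the optimal expected cost-to-goal from the initial state. Read this way, a lower bound of $\Omega(\sqrt{K})$ means that for every learning algorithm there is an instance on which the expected regret is at least $c_0 \sqrt{K}$ for some constant $c_0 > 0$ that does not depend on $K$, so the average per-episode regret $R_K / K$ cannot decay faster than order $1/\sqrt{K}$ in the worst case.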

Agent

Updated 2025-11-08

Regret Lower Bounds for Decentralized Multi-Agent Stochastic Shortest Path Problems

Promoting Sustainable Web Agents: Benchmarking and Estimating Energy Consumption through Empirical and Theoretical Analysis

Beyond Shortest Path: Agentic Vehicular Routing with Semantic Context

Speed at the Cost of Quality? The Impact of LLM Agent Assistance on Software Development

Post-Training LLMs as Better Decision-Making Agents: A Regret-Minimization Approach

GUI-360: A Comprehensive Dataset and Benchmark for Computer-Using Agents

BAPPA: Benchmarking Agents, Plans, and Pipelines for Automated Text-to-SQL Generation

Learning from Online Videos at Inference Time for Computer-Use Agents

Agentmandering: A Game-Theoretic Framework for Fair Redistricting via Large Language Model Agents

Benchmarking and Studying the LLM-based Agent System in End-to-End Software Development

ArchPilot: A Proxy-Guided Multi-Agent Approach for Machine Learning Engineering

PEFA-AI: Advancing Open-source LLMs for RTL generation using Progressive Error Feedback Agentic-AI

KnowThyself: An Agentic Assistant for LLM Interpretability

ASAP: an Agentic Solution to Auto-optimize Performance of Large-Scale LLM Training

PoCo: Agentic Proof-of-Concept Exploit Generation for Smart Contracts

Measuring the Security of Mobile LLM Agents under Adversarial Prompts from Untrusted Third-Party Channels

CREA: A Collaborative Multi-Agent Framework for Creative Image Editing and Generation

Collaboration Dynamics and Reliability Challenges of Multi-Agent LLM Systems in Finite Element Analysis