Published: 2025-09-18

Updated: 2025-10-07

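⚠️ All summaries below were generated by a large language model. They may contain errors, are provided for reference only, and should be used with caution.
🔴 Note: do not use them in serious academic settings; they are suitable only for a first-pass screening before reading the papers.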

Updated 2025-09-18

Scaling Agents via Continual Pre-training

Authors: Liangcai Su, Zhen Zhang, Guangyu Li, Zhuo Chen, Chenxi Wang, Maojia Song, Xinyu Wang, Kuan Li, Jialong Wu, Xuanzhong Chen, Zile Qiao, Zhongwang Zhang, Huifeng Yin, Shihao Cai, Runnan Fang, Zhengwei Tao, Wenbiao Yin, Chenxiong Qian, Yong Jiang, Pengjun Xie, Fei Huang, Jingren Zhou

Large language models (LLMs) have evolved into agentic systems capable of autonomous tool use and multi-step reasoning for complex problem-solving. However, post-training approaches building upon general-purpose foundation models consistently underperform in agentic tasks, particularly in open-source implementations. We identify the root cause: the absence of robust agentic foundation models forces models during post-training to simultaneously learn diverse agentic behaviors while aligning them to expert demonstrations, thereby creating fundamental optimization tensions. To this end, we are the first to propose incorporating Agentic Continual Pre-training (Agentic CPT) into the deep research agent training pipeline to build powerful agentic foundation models. Based on this approach, we develop a deep research agent model named AgentFounder. We evaluate our AgentFounder-30B on 10 benchmarks and achieve state-of-the-art performance while retaining strong tool-use ability, notably 39.9% on BrowseComp-en, 43.3% on BrowseComp-zh, and 31.5% Pass@1 on HLE.

Summary

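Large language models have evolved into agentic systems capable of autonomous tool use and multi-step reasoning on complex problems. However, post-training methods built on general-purpose foundation models have consistently fallen short on agentic tasks, especially in open-source implementations. The root cause is the lack of robust agentic foundation models: during post-training, a model must learn diverse agentic behaviors and align them with expert demonstrations at the same time, which creates fundamental optimization tensions. To address this, we are the first to propose incorporating Agentic Continual Pre-training (Agentic CPT) into the deep research agent training pipeline in order to build powerful agentic foundation models. Based on this approach, we develop a deep research agent model named AgentFounder. We evaluate AgentFounder-30B on 10 benchmarks, where it achieves outstanding performance while retaining strong tool-use ability, notably 39.9% on BrowseComp-en, 43.3% on BrowseComp-zh, and 31.5% Pass@1 on HLE.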

Key Takeaways

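Large language models (LLMs) have evolved into agentic systems capable of autonomous tool use and multi-step reasoning on complex problems.
Post-training methods built on general-purpose foundation models underperform on agentic tasks; the root cause is the lack of robust agentic foundation models.
Agentic Continual Pre-training (Agentic CPT) is incorporated into deep research agent training for the first time, in order to build powerful agentic foundation models.
The proposed AgentFounder model delivers outstanding performance across multiple benchmarks.
AgentFounder also shows strong tool-use ability.
AgentFounder reaches 39.9% on BrowseComp-en, 43.3% on BrowseComp-zh, and 31.5% Pass@1 on HLE.
These results suggest that adding agentic continual pre-training can significantly improve model performance on agentic tasks.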

Agent

Updated 2025-09-18

Scaling Agents via Continual Pre-training

WebResearcher: Unleashing unbounded reasoning capability in Long-Horizon Agents

WebSailor-V2: Bridging the Chasm to Proprietary Agents via Synthetic Data and Scalable Reinforcement Learning

xOffense: An AI-driven autonomous penetration testing framework with offensive knowledge-enhanced LLMs and multi agent systems

HLSMAC: A New StarCraft Multi-Agent Challenge for High-Level Strategic Decision-Making

Tool-R1: Sample-Efficient Reinforcement Learning for Agentic Tool Use

DeltaHedge: A Multi-Agent Framework for Portfolio Options Optimization

EvoEmpirBench: Dynamic Spatial Reasoning with Agent-ExpVer

Learning to Generate Pointing Gestures in Situated Embodied Conversational Agents

Finite-Agent Stochastic Differential Games on Large Graphs: II. Graph-Based Architectures

Agentic Lybic: Multi-Agent Execution System with Tiered Reasoning and Orchestration

Auditable Early Stopping for Agentic Routing: Ledger-Verified Run-Wise Certificates under Local DP

PVPO: Pre-Estimated Value-Based Policy Optimization for Agentic Reasoning

AMAZe: A Multi-Agent Zero-shot Index Advisor for Relational Databases

Breaking Single-Tester Limits: Multi-Agent LLMs for Multi-User Feature Testing

Small Language Models are the Future of Agentic AI

HiMATE: A Hierarchical Multi-Agent Framework for Machine Translation Evaluation

TRANSAGENT: An LLM-Based Multi-Agent System for Code Translation

Crafting Customisable Characters with LLMs: A Persona-Driven Role-Playing Agent Framework