Published: 2025-02-17

Updated: 2025-05-14

Word count: 2.5k

Reading time: 10 min

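⚠️ All summaries below were generated by a large language model. They may contain errors and are for reference only; use them with caution.
🔴 Note: do not use them in serious academic settings. They are intended only for a first-pass screening before reading the papers.
💗 If you find our project ChatPaperFree helpful, please give us some encouragement! ⭐️ Try it for free on HuggingFace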

Updated 2025-02-17

From PowerPoint UI Sketches to Web-Based Applications: Pattern-Driven Code Generation for GIS Dashboard Development Using Knowledge-Augmented LLMs, Context-Aware Visual Prompting, and the React Framework

Authors:Haowen Xu, Xiao-Ying Yu

Developing web-based GIS applications, commonly known as CyberGIS dashboards, for querying and visualizing GIS data in environmental research often demands repetitive and resource-intensive efforts. While Generative AI offers automation potential for code generation, it struggles with complex scientific applications due to challenges in integrating domain knowledge, software engineering principles, and UI design best practices. This paper introduces a knowledge-augmented code generation framework that retrieves software engineering best practices, domain expertise, and advanced technology stacks from a specialized knowledge base to enhance Generative Pre-trained Transformers (GPT) for front-end development. The framework automates the creation of GIS-based web applications (e.g., dashboards, interfaces) from user-defined UI wireframes sketched in tools like PowerPoint or Adobe Illustrator. A novel Context-Aware Visual Prompting method, implemented in Python, extracts layouts and interface features from these wireframes to guide code generation. Our approach leverages Large Language Models (LLMs) to generate front-end code by integrating structured reasoning, software engineering principles, and domain knowledge, drawing inspiration from Chain-of-Thought (CoT) prompting and Retrieval-Augmented Generation (RAG). A case study demonstrates the framework’s capability to generate a modular, maintainable web platform hosting multiple dashboards for visualizing environmental and energy data (e.g., time-series, shapefiles, rasters) from user-sketched wireframes. By employing a knowledge-driven approach, the framework produces scalable, industry-standard front-end code using design patterns such as Model-View-ViewModel (MVVM) and frameworks like React. This significantly reduces manual effort in design and coding, pioneering an automated and efficient method for developing smart city software.

Paper and project links

PDF

Summary
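Developing web-based GIS applications for querying and visualizing data in environmental research, known as CyberGIS dashboards, is repetitive and resource-intensive. This paper introduces a knowledge-augmented code generation framework that retrieves domain knowledge, software engineering best practices, and technology stacks from a specialized knowledge base to strengthen GPT models for front-end development. The framework generates code for GIS-based web applications from UI wireframes that users sketch in tools such as PowerPoint or Adobe Illustrator. Its Context-Aware Visual Prompting method, implemented in Python, extracts layouts and interface features from the wireframes to guide code generation. The approach combines structured reasoning, software engineering principles, and domain knowledge, drawing inspiration from Chain-of-Thought (CoT) prompting and Retrieval-Augmented Generation (RAG). In a case study it produces a modular web platform hosting dashboards that visualize environmental and energy data. This knowledge-driven approach reduces manual design and coding effort and offers an automated, efficient way to develop smart city software.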
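
The idea of turning a wireframe into layout guidance for the model can be illustrated with a minimal sketch. It assumes the shapes have already been read from the slide into simple records; the `Shape` fields, the `describe_layout` function, and the prompt wording are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Shape:
    """One wireframe element; these fields are hypothetical, not the paper's schema."""
    label: str    # text typed into the shape, e.g. "Map"
    left: float   # position and size in slide units
    top: float
    width: float
    height: float


def describe_layout(shapes, slide_width, slide_height):
    """Turn absolute shape boxes into a relative, reading-order layout prompt."""
    lines = []
    # Sort top-to-bottom, then left-to-right, so the prompt follows reading order.
    for s in sorted(shapes, key=lambda s: (s.top, s.left)):
        x = round(100 * s.left / slide_width)
        y = round(100 * s.top / slide_height)
        w = round(100 * s.width / slide_width)
        h = round(100 * s.height / slide_height)
        lines.append(f"- {s.label}: x={x}%, y={y}%, width={w}%, height={h}%")
    return "Generate a React layout with these regions:\n" + "\n".join(lines)


# A made-up three-panel dashboard sketch on an 800 x 500 slide.
wireframe = [
    Shape("Map", left=0, top=100, width=600, height=400),
    Shape("Header", left=0, top=0, width=800, height=100),
    Shape("Time-series chart", left=600, top=100, width=200, height=400),
]
prompt = describe_layout(wireframe, slide_width=800, slide_height=500)
print(prompt)
```

Expressing positions as percentages keeps the description independent of the slide size, which is one plausible way to make a sketch reusable as a responsive layout hint.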

Key Takeaways

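Developing CyberGIS dashboards is typically repetitive and resource-intensive. Generative AI can automate code generation but struggles to integrate domain knowledge, software engineering principles, and UI design best practices.
The paper introduces a knowledge-augmented code generation framework that retrieves domain knowledge, software engineering best practices, and technology stacks from a knowledge base to strengthen GPT models for front-end development.
The framework generates GIS-based web applications (e.g., dashboards, interfaces) from user-defined UI wireframes, reducing manual development work.
A Context-Aware Visual Prompting method extracts layouts and interface features from the wireframes to guide code generation; the generated code uses design patterns such as MVVM and frameworks such as React.
LLMs generate the front-end code by combining structured reasoning, software engineering principles, and domain knowledge.
A case study shows that the framework can generate a modular, maintainable web platform hosting multiple dashboards for visualizing environmental and energy data (e.g., time series, shapefiles, rasters).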
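
The knowledge-augmented prompting described in these takeaways can be sketched as follows. This is an illustration under stated assumptions: the knowledge-base entries are invented, simple word overlap stands in for whatever retrieval the paper actually uses, and `retrieve` and `build_prompt` are hypothetical names.

```python
import re

# Hypothetical knowledge-base entries; the paper's actual knowledge base is not shown here.
KNOWLEDGE_BASE = [
    "MVVM: keep view components free of data-fetching logic; put state in a view model.",
    "React: build each dashboard panel as a reusable function component.",
    "GIS: render raster and shapefile layers on a shared map component.",
]


def tokenize(text):
    """Lowercase words of three or more letters, which drops fillers such as 'a' and 'of'."""
    return set(re.findall(r"[a-z]{3,}", text.lower()))


def retrieve(query, entries, top_k=2):
    """Rank entries by word overlap with the query (a stand-in for real retrieval)."""
    query_words = tokenize(query)
    scored = [(len(query_words & tokenize(entry)), entry) for entry in entries]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [entry for score, entry in ranked[:top_k] if score > 0]


def build_prompt(task, entries):
    """Prepend retrieved guidance and ask for a step-by-step plan before the code."""
    parts = ["Guidance:"] + [f"- {g}" for g in retrieve(task, entries)]
    parts.append("Task: " + task)
    parts.append("First outline the component structure step by step, then write the code.")
    return "\n".join(parts)


task = "Create a React dashboard panel with a map showing a raster layer"
print(build_prompt(task, KNOWLEDGE_BASE))
```

The final instruction line mirrors the Chain-of-Thought idea of planning before coding, while the guidance lines mirror the retrieval step; both are simplified for illustration.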

Hello Again! LLM-powered Personalized Agent for Long-term Dialogue

Authors:Hao Li, Chenghao Yang, An Zhang, Yang Deng, Xiang Wang, Tat-Seng Chua

Open-domain dialogue systems have seen remarkable advancements with the development of large language models (LLMs). Nonetheless, most existing dialogue systems predominantly focus on brief single-session interactions, neglecting the real-world demands for long-term companionship and personalized interactions with chatbots. Crucial to addressing this real-world need are event summary and persona management, which enable reasoning for appropriate long-term dialogue responses. Recent progress in the human-like cognitive and reasoning capabilities of LLMs suggests that LLM-based agents could significantly enhance automated perception, decision-making, and problem-solving. In response to this potential, we introduce a model-agnostic framework, the Long-term Dialogue Agent (LD-Agent), which incorporates three independently tunable modules dedicated to event perception, persona extraction, and response generation. For the event memory module, long and short-term memory banks are employed to separately focus on historical and ongoing sessions, while a topic-based retrieval mechanism is introduced to enhance the accuracy of memory retrieval. Furthermore, the persona module conducts dynamic persona modeling for both users and agents. The integration of retrieved memories and extracted personas is subsequently fed into the generator to induce appropriate responses. The effectiveness, generality, and cross-domain capabilities of LD-Agent are empirically demonstrated across various illustrative benchmarks, models, and tasks. The code is released at https://github.com/leolee99/LD-Agent.

Paper and project links

PDF Accepted to NAACL 2025

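Summary
Open-domain dialogue systems have advanced markedly with the development of large language models (LLMs). However, most existing systems focus on brief single-session interactions and neglect the real-world need for long-term companionship and personalized interaction with chatbots. To address this, the authors introduce the Long-term Dialogue Agent (LD-Agent), a model-agnostic framework with three independently tunable modules for event perception, persona extraction, and response generation. The framework uses long-term and short-term memory banks to cover historical and ongoing sessions respectively, and adds a topic-based retrieval mechanism to improve the accuracy of memory retrieval. The persona module models personas dynamically for both users and agents. Experiments demonstrate LD-Agent's effectiveness, generality, and cross-domain capabilities.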
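
The split between long-term and short-term memory with topic-based retrieval can be sketched roughly as below. This is not the LD-Agent implementation (see the released repository for that); the `EventMemory` class and its methods are hypothetical, and the session summaries and topics are passed in by hand, whereas a real system would presumably produce them with a model.

```python
from dataclasses import dataclass, field


@dataclass
class EventMemory:
    """Toy event memory: summaries of past sessions plus the ongoing session."""
    long_term: list = field(default_factory=list)   # (topics, summary) per past session
    short_term: list = field(default_factory=list)  # utterances of the ongoing session

    def add_utterance(self, text):
        self.short_term.append(text)

    def end_session(self, summary, topics):
        """Archive the session summary with its topics and clear the short-term bank."""
        self.long_term.append((frozenset(topics), summary))
        self.short_term.clear()

    def retrieve(self, query_topics, top_k=1):
        """Return the summaries that share the most topics with the query."""
        query = set(query_topics)
        scored = [(len(query & topics), summary) for topics, summary in self.long_term]
        ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
        return [summary for overlap, summary in ranked[:top_k] if overlap > 0]


memory = EventMemory()
memory.add_utterance("I adopted a puppy last week.")
memory.end_session("User adopted a puppy.", topics={"pet", "puppy"})
memory.end_session("User is training for a marathon.", topics={"running", "sport"})
memory.add_utterance("My dog chewed my shoes today.")

# Context for the response generator: relevant history plus the current session.
context = memory.retrieve({"pet", "dog"}) + memory.short_term
print(context)
```

Filtering history by topic rather than recency is what lets an old but relevant session (the puppy) surface ahead of a newer, unrelated one (the marathon).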

Key Takeaways