| 1 |
A Modular and Multimodal Generative AI Framework for Urban Building Energy Data: Generating Synthetic Homes |
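Proposes a modular, multimodal generative AI framework for generating urban building energy data and synthesizing residential information. |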
multimodal |
|
|
| 2 |
Boosting Embodied AI Agents through Perception-Generation Disaggregation and Asynchronous Pipeline Execution |
Auras: boosts embodied agent performance by disaggregating perception and generation and executing the pipeline asynchronously. |
embodied AI |
|
|
| 3 |
LoCoBench: A Benchmark for Long-Context Large Language Models in Complex Software Engineering |
Proposes LoCoBench for evaluating long-context LLMs on complex software engineering tasks. |
large language model |
✅ |
|
| 4 |
Quality Assessment of Tabular Data using Large Language Models and Code Generation |
Proposes a framework for assessing tabular data quality based on large language models and code generation. |
large language model |
|
|
| 5 |
On Integrating Large Language Models and Scenario-Based Programming for Improving Software Reliability |
Combines large language models with scenario-based programming to improve software reliability. |
large language model |
|
|
| 6 |
DP-FedLoRA: Privacy-Enhanced Federated Fine-Tuning for On-Device Large Language Models |
Proposes DP-FedLoRA to strengthen privacy protection in federated fine-tuning of on-device LLMs. |
large language model |
|
|
| 7 |
Vibe Check: Understanding the Effects of LLM-Based Conversational Agents' Personality and Alignment on User Perceptions in Goal-Oriented Tasks |
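Studies how the personality expression of LLM conversational agents, and how well it matches the user, affect user perceptions in goal-oriented tasks. |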
large language model |
|
|
| 8 |
LLMs as Agentic Cooperative Players in Multiplayer UNO |
Uses LLMs as agentic cooperative players in multiplayer UNO. |
large language model |
|
|
| 9 |
Towards a Common Framework for Autoformalization |
Proposes a common framework for autoformalization to promote cross-fertilization among AI systems in different domains. |
large language model |
|
|
| 10 |
The Illusion of Diminishing Returns: Measuring Long Horizon Execution in LLMs |
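Reveals LLMs' long-horizon execution capability: gains in single-step accuracy yield exponential growth in the length of tasks a model can complete. |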
large language model |
|
|
| 11 |
TORSO: Template-Oriented Reasoning Towards General Tasks |
Proposes TORSO: template-oriented reasoning that improves LLM performance on general tasks without manually crafted examples. |
large language model |
|
|
| 12 |
Towards Adaptive ML Benchmarks: Web-Agent-Driven Construction, Domain Expansion, and Metric Optimization |
Proposes TAM Bench, an adaptive machine learning benchmark driven by web agents, for evaluating LLMs on end-to-end ML tasks. |
large language model |
|
|
| 13 |
LightAgent: Production-level Open-source Agentic AI Framework |
Proposes LightAgent, a production-level open-source agentic AI framework designed to simplify the deployment of multi-agent systems. |
large language model |
✅ |
|
| 14 |
Jupiter: Enhancing LLM Data Analysis Capabilities via Notebook and Inference-Time Value-Guided Search |
Jupiter: enhances LLM data analysis capabilities through notebooks and inference-time value-guided search. |
large language model |
✅ |
|
| 15 |
Character-Level Perturbations Disrupt LLM Watermarks |
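Proposes an LLM watermark removal attack based on character-level perturbations, revealing the vulnerability of existing watermarking schemes. |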
large language model |
|
|
| 16 |
Towards Confidential and Efficient LLM Inference with Dual Privacy Protection |
CMIF: a dual privacy protection framework for LLM inference that balances efficiency and security. |
large language model |
|
|
| 17 |
Strategic Tradeoffs Between Humans and AI in Multi-Agent Bargaining |
Compares the strategic tradeoffs of humans, LLMs, and Bayesian agents in multi-agent bargaining. |
large language model |
|
|