| 1 | Grounding Generative Planners in Verifiable Logic: A Hybrid Architecture for Trustworthy Embodied AI | Proposes the VIRF framework, in which a logic tutor collaborates with an LLM to achieve verifiably safe planning for embodied AI. | embodied AI large language model | | |
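| 2 | Gesturing Toward Abstraction: Multimodal Convention Formation in Collaborative Physical Tasks | Studies how multimodal communication strategies evolve in human-machine collaboration and proposes a probabilistic-model-based method for abstract concept formation. | multimodal | | |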
| 3 | Scalable Delphi: Large Language Models for Structured Risk Estimation | Proposes Scalable Delphi, which uses large language models for scalable structured risk estimation. | large language model | | |
| 4 | DeepQuali: Initial results of a study on the use of large language models for assessing the quality of user stories | DeepQuali: a preliminary study on using large language models to assess the quality of user stories. | large language model | | |
| 5 | Root Cause Analysis Method Based on Large Language Models with Residual Connection Structures | Proposes RC-LLM, a method based on residual connections and large language models, for root cause analysis in microservice architectures. | large language model | | |
| 6 | Dynamics Within Latent Chain-of-Thought: An Empirical Study of Causal Structure | Uses causal structure analysis to reveal the dynamics and decision mechanisms within latent chain-of-thought. | chain-of-thought | | |
| 7 | 6G-Bench: An Open Benchmark for Semantic Communication and Network-Level Reasoning with Foundation Models in AI-Native 6G Networks | 6G-Bench: an open benchmark for semantic communication and network-level reasoning in AI-native 6G networks. | foundation model | ✅ | |
| 8 | An Attention Mechanism for Robust Multimodal Integration in a Global Workspace Architecture | Proposes an attention-based global workspace architecture that improves the noise robustness of multimodal fusion. | multimodal | | |
| 9 | Large Language Models in Peer-Run Community Behavioral Health Services: Understanding Peer Specialists and Service Users' Perspectives on Opportunities, Risks, and Mitigation Strategies | Explores the applications and risks of large language models in peer-run community mental health services. | large language model | | |
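| 10 | We Should Separate Memorization from Copyright | Separating memorization from copyright: proposes a method for evaluating AI model outputs that better aligns with copyright standards. | foundation model | | |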
| 11 | CoRefine: Confidence-Guided Self-Refinement for Adaptive Test-Time Compute | Proposes CoRefine, which uses confidence to guide LLM self-refinement and reduce inference compute cost. | large language model | | |
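| 12 | Automatic In-Domain Exemplar Construction and LLM-Based Refinement of Multi-LLM Expansions for Query Expansion | Proposes a query expansion framework that automatically constructs in-domain exemplars and uses an LLM to refine multi-LLM expansions. | large language model | | |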
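| 13 | OmniReview: A Large-scale Benchmark and LLM-enhanced Framework for Realistic Reviewer Recommendation | OmniReview: proposes a large-scale reviewer recommendation benchmark and the LLM-enhanced framework Pro-MMoE, improving the accuracy and interpretability of reviewer recommendation. | large language model | | |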
| 14 | Whose Name Comes Up? Benchmarking and Intervention-Based Auditing of LLM-Based Scholar Recommendation | LLMScholarBench: audits LLM-based scholar recommendation through interventions, revealing how those interventions affect model performance. | large language model | | |
| 15 | PRISM: A Principled Framework for Multi-Agent Reasoning via Gain Decomposition | PRISM: a principled framework for multi-agent reasoning based on gain decomposition. | large language model | | |
| 16 | Reinforcement Inference: Leveraging Uncertainty for Self-Correcting Language Model Reasoning | Proposes Reinforcement Inference, which leverages uncertainty to improve language model reasoning without retraining. | large language model | | |
| 17 | CLEAR: A Knowledge-Centric Vessel Trajectory Analysis Platform | CLEAR: a knowledge-centric vessel trajectory analysis platform that uses LLMs to enhance AIS data analysis. | large language model | | |
| 18 | From Assistant to Double Agent: Formalizing and Benchmarking Attacks on OpenClaw for Personalized Local AI Agent | Proposes the PASB framework for evaluating the security of the personalized local AI assistant OpenClaw. | large language model | ✅ | |
| 19 | On Protecting Agentic Systems' Intellectual Property via Watermarking | Proposes AGENTWM, which uses watermarking to protect agentic systems against imitation attacks. | large language model | | |
| 20 | SWE Context Bench: A Benchmark for Context Learning in Coding | SWE-ContextBench: a new benchmark for evaluating context learning in code generation. | large language model | | |
| 21 | Moral Sycophancy in Vision Language Models | Studies moral sycophancy in vision language models, revealing their vulnerability to being swayed by user opinions. | multimodal | | |
| 22 | Toward Formalizing LLM-Based Agent Designs through Structural Context Modeling and Semantic Dynamics Analysis | Proposes a structural context model that formalizes LLM agent design and improves performance on dynamic tasks. | large language model | | |
| 23 | G-LNS: Generative Large Neighborhood Search for LLM-Based Automatic Heuristic Design | G-LNS: LLM-based generative large neighborhood search for automatic heuristic design. | large language model | | |
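| 24 | Weak-Driven Learning: How Weak Agents make Strong Agents Stronger | Proposes WMSS weak-driven learning, which uses a model's historical weak states to guide optimization and break through the post-training saturation bottleneck. | large language model | | |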
| 25 | Nexus: Inferring Join Graphs from Metadata Alone via Iterative Low-Rank Matrix Completion | Nexus: infers join graphs from metadata alone via iterative low-rank matrix completion. | large language model | | |