| 1 |
How Multimodal Large Language Models Support Access to Visual Information: A Diary Study With Blind and Low Vision People |
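Studies how multimodal large language models help blind and low vision people access visual information, revealing the challenges and opportunities that arise in real-world use. |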
large language model multimodal |
|
|
| 2 |
BrowseComp-$V^3$: A Visual, Vertical, and Verifiable Benchmark for Multimodal Browsing Agents |
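Proposes BrowseComp-$V^3$, a benchmark for multimodal browsing agents that addresses the limitations of existing benchmarks in complexity, accessibility, and evaluation granularity. |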
large language model multimodal |
|
|
| 3 |
TriGen: NPU Architecture for End-to-End Acceleration of Large Language Models based on SW-HW Co-Design |
TriGen: an NPU architecture for end-to-end acceleration of large language models, based on software-hardware co-design |
large language model |
|
|
| 4 |
Assessing Spear-Phishing Website Generation in Large Language Model Coding Agents |
Assesses the ability of large language model coding agents to generate spear-phishing websites |
large language model |
|
|
| 5 |
RQ-GMM: Residual Quantized Gaussian Mixture Model for Multimodal Semantic Discretization in CTR Prediction |
Proposes RQ-GMM for multimodal semantic discretization in CTR prediction, improving click-through rate. |
multimodal |
|
|
| 6 |
Artic: AI-oriented Real-time Communication for MLLM Video Assistant |
Artic: an AI-oriented real-time communication framework for MLLM video assistants that improves accuracy and reduces latency |
large language model multimodal |
✅ |
|
| 7 |
Protect$^*$: Steerable Retrosynthesis through Neuro-Symbolic State Encoding |
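Protect$^*$: proposes a neuro-symbolic framework that guides LLMs in generating chemical reaction pathways through steerable retrosynthesis analysis. |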
large language model |
|
|
| 8 |
AI Agents for Inventory Control: Human-LLM-OR Complementarity |
Proposes an AI agent for inventory control built on human-LLM-OR collaboration, improving decision performance in complex scenarios |
large language model |
|
|
| 9 |
Arming Data Agents with Tribal Knowledge |
Tk-Boost: augments NL2SQL data agents with tribal knowledge to improve query accuracy |
large language model |
|
|
| 10 |
Asynchronous Verified Semantic Caching for Tiered LLM Architectures |
Krites: asynchronous verified semantic caching that increases static cache coverage in tiered LLM architectures |
large language model |
|
|
| 11 |
Buy versus Build an LLM: A Decision Framework for Governments |
Proposes a decision framework to help governments choose between buying and building an LLM |
large language model |
|
|
| 12 |
G2CP: A Graph-Grounded Communication Protocol for Verifiable and Efficient Multi-Agent Reasoning |
Proposes G2CP, a graph-grounded communication protocol that addresses semantic drift and hallucination in multi-agent systems |
large language model |
|
|
| 13 |
Knowledge-Based Design Requirements for Generative Social Robots in Higher Education |
Proposes a knowledge-based design requirements framework for generative social robots in higher education |
large language model |
|
|
| 14 |
Think Fast and Slow: Step-Level Cognitive Depth Adaptation for LLM Agents |
CogRouter: a cognitive depth adaptation framework for LLM agents that improves efficiency and performance. |
large language model |
|
|
| 15 |
TensorCommitments: A Lightweight Verifiable Inference for Language Models |
TensorCommitments: a lightweight verifiable inference scheme for language models |
large language model |
|
|
| 16 |
GeoAgent: Learning to Geolocate Everywhere with Reinforced Geographic Characteristics |
GeoAgent: learns to geolocate anywhere using reinforced geographic characteristics |
chain-of-thought |
|
|