| 1 |
Beyond Retrieval: Improving Evidence Quality for LLM-based Multimodal Fact-Checking |
Aletheia: Improving evidence quality to strengthen LLM performance in multimodal fact-checking |
large language model multimodal |
|
|
| 2 |
Can Large Language Models Predict Parallel Code Performance? |
Using large language models to predict parallel GPU code performance without hardware profiling |
large language model |
|
|
| 3 |
LogiDebrief: A Signal-Temporal Logic based Automated Debriefing Approach with Large Language Models Integration |
LogiDebrief: An automated 9-1-1 call evaluation framework combining temporal logic with large language models |
large language model |
|
|
| 4 |
OSUniverse: Benchmark for Multimodal GUI-navigation AI Agents |
OSUniverse: A benchmark platform for multimodal GUI-navigation AI agents |
multimodal |
✅ |
|
| 5 |
Validating the Effectiveness of a Large Language Model-based Approach for Identifying Children's Development across Various Free Play Settings in Kindergarten |
Proposes a large language model-based method for assessing children's developmental abilities in free-play settings |
large language model |
|
|
| 6 |
Synthline: A Product Line Approach for Synthetic Requirements Engineering Data Generation using Large Language Models |
Synthline: A product line-based approach to generating synthetic requirements engineering data with large language models |
large language model |
|
|
| 7 |
LlamaFirewall: An open source guardrail system for building secure AI agents |
LlamaFirewall: An open-source security guardrail system for building secure AI agents |
large language model chain-of-thought |
|
|
| 8 |
AI-Driven Scholarly Peer Review via Persistent Workflow Prompting, Meta-Prompting, and Meta-Reasoning |
Proposes Persistent Workflow Prompting (PWP) to improve LLM performance in scholarly peer review |
large language model multimodal |
|
|
| 9 |
Capability-Driven Skill Generation with LLMs: A RAG-Based Approach for Reusing Existing Libraries and Interfaces |
Proposes a RAG-based, capability-driven skill generation method for LLMs that reuses existing libraries and interfaces |
large language model |
|
|
| 10 |
Prism: Unleashing GPU Sharing for Cost-Efficient Multi-LLM Serving |
Prism: Unleashing the potential of GPU sharing for cost-efficient multi-LLM serving |
large language model |
|
|
| 11 |
Binding threshold units with artificial oscillatory neurons |
Proposes a Hopfield-Kuramoto associative memory model that combines threshold units with oscillatory neurons to achieve low-rank weight correction. |
large language model |
|
|
| 12 |
am-ELO: A Stable Framework for Arena-based LLM Evaluation |
Proposes am-ELO, a stable arena-based LLM evaluation framework that addresses the instability of the ELO system. |
large language model |
|
|
| 13 |
Procedural Memory Is Not All You Need: Bridging Cognitive Gaps in LLM-Based Agents |
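Proposes a cognitive augmentation framework for LLM agents that bridges the gap between procedural memory and adaptability to complex environments |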
large language model |
|
|
| 14 |
Graph Drawing for LLMs: An Empirical Evaluation |
Studies how graph layout affects LLM performance on graph tasks, with the aim of optimizing visual-modality input |
large language model |
|
|
| 15 |
A Hashgraph-Inspired Consensus Mechanism for Reliable Multi-Model Reasoning |
Proposes a Hashgraph-inspired consensus mechanism to improve the reliability of multi-model reasoning |
large language model |
|
|
| 16 |
STORY2GAME: Generating (Almost) Everything in an Interactive Fiction Game |
STORY2GAME: Uses large language models to generate interactive fiction games, automatically constructing the story, world, and game logic. |
large language model |
|
|
| 17 |
Domain Adversarial Training for Mitigating Gender Bias in Speech-based Mental Health Detection |
Proposes a domain adversarial training-based method for speech-based mental health detection that mitigates gender bias. |
foundation model |
|
|
| 18 |
RAG-MCP: Mitigating Prompt Bloat in LLM Tool Selection via Retrieval-Augmented Generation |
RAG-MCP: Mitigating prompt bloat in LLM tool selection via retrieval-augmented generation |
large language model |
|
|
| 19 |
Accelerating Evolution: Integrating PSO Principles into Real-Coded Genetic Algorithm Crossover |
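Proposes PSOX, a crossover operator inspired by particle swarm optimization, to accelerate the convergence of real-coded genetic algorithms. |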
multimodal |
|
|
| 20 |
DocSpiral: A Platform for Integrated Assistive Document Annotation through Human-in-the-Spiral |
DocSpiral platform: Accelerates structured annotation of image documents through a human-machine collaborative loop. |
large language model |
|
|
| 21 |
Patterns and Mechanisms of Contrastive Activation Engineering |
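Contrastive activation engineering (CAE) steers large language models but suffers from out-of-distribution failures, vulnerability to attacks, and other issues. |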
large language model |
|
|
| 22 |
Assessing and Enhancing the Robustness of LLM-based Multi-Agent Systems Through Chaos Engineering |
Proposes a chaos engineering-based framework to improve the robustness of LLM multi-agent systems in real-world environments. |
large language model |
|
|