| 1 |
InPhyRe Discovers: Large Multimodal Models Struggle in Inductive Physical Reasoning |
Proposes the InPhyRe benchmark, revealing the shortcomings of large multimodal models in inductive physical reasoning |
multimodal |
|
|
| 2 |
WALL: A Web Application for Automated Quality Assurance using Large Language Models |
WALL: a web application that uses large language models for automated code quality assurance |
large language model |
|
|
| 3 |
Smart Trial: Evaluating the Use of Large Language Models for Recruiting Clinical Trial Participants via Social Media |
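Using large language models to recruit clinical trial participants via social media: introduces the TRIALQA dataset and benchmarks models on it |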
large language model |
|
|
| 4 |
Tackling One Health Risks: How Large Language Models are leveraged for Risk Negotiation and Consensus-building |
Using large language models for risk negotiation and consensus-building to address One Health risks |
large language model |
|
|
| 5 |
DOCUEVAL: An LLM-based AI Engineering Tool for Building Customisable Document Evaluation Workflows |
DOCUEVAL: an LLM-based AI engineering tool for building customisable document evaluation workflows |
large language model foundation model |
|
|
| 6 |
LLM in the Middle: A Systematic Review of Threats and Mitigations to Real-World LLM-based Systems |
Systematically analyzes threats and defenses in LLM applications to guide secure deployment |
large language model |
|
|
| 7 |
Abduct, Act, Predict: Scaffolding Causal Inference for Automated Failure Attribution in Multi-Agent Systems |
Proposes the A2P framework, which uses causal reasoning to improve failure attribution accuracy in multi-agent systems |
large language model |
✅ |
|
| 8 |
GenAI Voice Mode in Programming Education |
Explores the use of GenAI voice mode in programming education to address accessibility problems for novice programmers |
multimodal |
|
|
| 9 |
The Morality of Probability: How Implicit Moral Biases in LLMs May Shape the Future of Human-AI Symbiosis |
Reveals implicit moral biases in LLMs and explores the future of human-AI symbiosis |
large language model |
|
|
| 10 |
Generating Energy-Efficient Code via Large-Language Models -- Where are we now? |
Evaluates the energy efficiency of LLM-generated code through a comparison with code written by human experts |
large language model |
|
|
| 11 |
Securing LLM-Generated Embedded Firmware through AI Agent-Driven Validation and Patching |
Proposes an AI agent-driven validation and patching method to secure LLM-generated embedded firmware |
large language model |
|
|
| 12 |
LLM Bazaar: A Service Design for Supporting Collaborative Learning with an LLM-Powered Multi-Party Collaboration Infrastructure |
Proposes LLM Bazaar to support multi-party collaborative learning |
large language model |
|
|