| 1 |
FESTA: Functionally Equivalent Sampling for Trust Assessment of Multimodal LLMs |
Proposes FESTA, which assesses the confidence of multimodal LLMs through functionally equivalent sampling |
large language model multimodal |
|
|
| 2 |
SalaMAnder: Shapley-based Mathematical Expression Attribution and Metric for Chain-of-Thought Reasoning |
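Proposes SalaMAnder, which uses Shapley values to measure the contribution of mathematical expressions in CoT reasoning and to optimize prompt construction |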
large language model chain-of-thought |
|
|
| 3 |
Zero-Shot Human Mobility Forecasting via Large Language Model with Hierarchical Reasoning |
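Proposes the ZHMF framework, which uses a large language model with hierarchical reasoning for zero-shot human mobility forecasting |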
large language model |
|
|
| 4 |
Roundtable Policy: Improving Scientific Reasoning and Narratives through Confidence-Weighted Consensus of LLMs |
Proposes Roundtable Policy, which improves scientific reasoning and narratives through confidence-weighted consensus of LLMs |
large language model chain-of-thought |
|
|
| 5 |
NUMINA: A Natural Understanding Benchmark for Multi-dimensional Intelligence and Numerical Reasoning Abilities |
Proposes the NUMINA benchmark for evaluating the numerical reasoning abilities of multimodal LLMs in 3D indoor scenes |
large language model multimodal |
✅ |
|
| 6 |
ACCeLLiuM: Supervised Fine-Tuning for Automated OpenACC Pragma Generation |
ACCeLLiuM: a supervised fine-tuning method for automatically generating OpenACC pragmas (compiler directives) |
large language model |
|
|
| 7 |
Design and Development of an Intelligent LLM-based LDAP Honeypot |
Proposes an intelligent LLM-based LDAP honeypot that improves the adaptability and usability of network security defenses |
large language model |
|
|