| 1 |
ErrorRadar: Benchmarking Complex Mathematical Reasoning of Multimodal Large Language Models Via Error Detection |
ErrorRadar: evaluating the complex mathematical reasoning ability of multimodal large language models through error detection |
large language model multimodal |
|
|
| 2 |
FAMMA: A Benchmark for Financial Domain Multilingual Multimodal Question Answering |
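Proposes FAMMA, a multilingual multimodal question-answering benchmark for the financial domain that challenges the complex reasoning abilities of LLMs. |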
large language model multimodal |
✅ |
|
| 3 |
Can LLMs Improve Multimodal Fact-Checking by Asking Relevant Questions? |
Proposes the LRQ-FACT framework, which uses LLMs to generate relevant questions to improve multimodal fact-checking |
large language model multimodal |
|
|
| 4 |
Diagnosing Robotics Systems Issues with Large Language Models |
Uses large language models to diagnose robotics system issues, enabling efficient root-cause analysis |
large language model |
|
|
| 5 |
Leveraging Large Language Models for Suicide Detection on Social Media with Limited Labels |
Uses large language models and limited labels for suicide detection on social media |
large language model |
✅ |
|
| 6 |
Revisiting In-context Learning Inference Circuit in Large Language Models |
Proposes an ICL inference circuit model that explains and unifies in-context learning phenomena in large language models. |
large language model |
|
|
| 7 |
Mitigating Hallucinations Using Ensemble of Knowledge Graph and Vector Store in Large Language Models to Enhance Mental Health Support |
Uses an ensemble of a knowledge graph and a vector store to mitigate hallucinations of large language models in mental health support |
large language model |
|
|
| 8 |
Knowledge-Guided Dynamic Modality Attention Fusion Framework for Multimodal Sentiment Analysis |
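Proposes KuDA, a knowledge-guided dynamic modality attention fusion framework, to address dynamically changing modality dominance in multimodal sentiment analysis. |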
multimodal |
|
|
| 9 |
SafeLLM: Domain-Specific Safety Monitoring for Large Language Models: A Case Study of Offshore Wind Maintenance |
Proposes a safety monitoring method to address risks in offshore wind maintenance |
large language model |
|
|
| 10 |
On the Reliability of Large Language Models to Misinformed and Demographically-Informed Prompts |
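Evaluates the reliability of large language models when given misinformation and demographic information in the climate change and mental health domains |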
large language model |
|
|
| 11 |
Control Large Language Models via Divide and Conquer |
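Proposes a divide-and-conquer strategy to improve the controllability of large language models on lexically constrained generation tasks |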
large language model |
|
|
| 12 |
ProtoMed-LLM: An Automatic Evaluation Framework for Large Language Models in Medical Protocol Formulation |
ProtoMed-LLM: an automatic framework for evaluating large language models in medical protocol generation |
large language model |
|
|
| 13 |
Lens: Rethinking Multilingual Enhancement for Large Language Models |
Lens: enhancing the multilingual capabilities of large language models by reshaping their internal language representation space |
large language model |
|
|
| 14 |
ReTok: Replacing Tokenizer to Enhance Representation Efficiency in Large Language Model |
ReTok: improving the representation efficiency of large language models by replacing the tokenizer |
large language model |
|
|
| 15 |
Wrong-of-Thought: An Integrated Reasoning Framework with Multi-Perspective Verification and Wrong Information |
Proposes Wrong-of-Thought to address the problem of wrong information in LLM reasoning |
large language model chain-of-thought |
|
|
| 16 |
Core Knowledge Deficits in Multi-Modal Language Models |
Reveals deficits in the core cognitive abilities of multimodal language models and proposes a concept-attack evaluation method. |
large language model |
|
|
| 17 |
DAMRO: Dive into the Attention Mechanism of LVLM to Reduce Object Hallucination |
DAMRO: reducing object hallucination by diving into the attention mechanism of LVLMs |
large language model |
✅ |
|
| 18 |
Evaluation of Code LLMs on Geospatial Code Generation |
Builds an evaluation benchmark for geospatial code generation to assess and improve the applicability of LLMs in this domain |
large language model |
|
|
| 19 |
How Does the Disclosure of AI Assistance Affect the Perceptions of Writing? |
Studies how disclosing AI writing assistance affects perceptions of writing quality |
large language model |
|
|
| 20 |
Toward Secure Tuning: Mitigating Security Risks from Instruction Fine-Tuning |
Proposes SWAT, a secure tuning strategy that mitigates the security risks of instruction fine-tuning for large language models |
large language model |
|
|
| 21 |
RevMUX: Data Multiplexing with Reversible Adapters for Efficient LLM Batch Inference |
Proposes RevMUX, a data multiplexing framework that uses reversible adapters to improve LLM batch inference efficiency |
large language model |
|
|
| 22 |
Fine-Grained Prediction of Reading Comprehension from Eye Movements |
Proposes a multimodal model that uses eye-movement data to predict fine-grained reading comprehension performance |
multimodal |
|
|
| 23 |
Inference Scaling for Long-Context Retrieval Augmented Generation |
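For long-context RAG, proposes inference compute scaling methods that optimize knowledge utilization and significantly improve performance. |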
large language model |
|
|