| 1 |
HM-RAG: Hierarchical Multi-Agent Multimodal Retrieval Augmented Generation |
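Proposes HM-RAG, a hierarchical multi-agent multimodal retrieval-augmented generation framework for knowledge synthesis under complex queries. |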
large language model multimodal |
✅ |
|
| 2 |
Improving Multilingual Capabilities with Cultural and Local Knowledge in Large Language Models While Enhancing Native Performance |
Mantra-14B: Strengthens LLM multilingual capabilities with cultural and local knowledge while improving native performance |
large language model |
|
|
| 3 |
ClinicalGPT-R1: Pushing reasoning capability of generalist disease diagnosis with large language model |
ClinicalGPT-R1: Uses large language models to improve reasoning for generalist disease diagnosis |
large language model |
✅ |
|
| 4 |
Can the capability of Large Language Models be described by human ability? A Meta Study |
Compares LLM and human abilities to examine whether LLM capability can be described by human ability metrics |
large language model |
|
|
| 5 |
Read Before You Think: Mitigating LLM Comprehension Failures with Step-by-Step Reading |
Proposes the Step-by-Step Reading (SSR) family of prompting methods to improve LLM problem comprehension in complex reasoning tasks. |
large language model chain-of-thought |
|
|
| 6 |
Short-Path Prompting in LLMs: Analyzing Reasoning Instability and Solutions for Robust Performance |
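Studies the effect of short-path prompting on LLM reasoning and proposes instruction-guided and fine-tuning methods to improve robustness |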
large language model chain-of-thought |
|
|
| 7 |
Syzygy of Thoughts: Improving LLM CoT with the Minimal Free Resolution |
Proposes the Syzygy of Thoughts (SoT) framework, which uses minimal free resolution to improve LLM CoT reasoning. |
large language model chain-of-thought |
✅ |
|
| 8 |
Myanmar XNLI: Building a Dataset and Exploring Low-resource Approaches to Natural Language Inference with Myanmar |
Builds a Myanmar XNLI dataset and explores low-resource approaches to natural language inference |
large language model |
|
|
| 9 |
How new data permeates LLM knowledge and how to dilute it |
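Studies generalization and hallucination when LLMs learn new knowledge, and proposes data augmentation and update pruning methods to improve knowledge specificity. |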
large language model |
✅ |
|
| 10 |
Meta-Evaluating Local LLMs: Rethinking Performance Metrics for Serious Games |
Proposes a meta-evaluation method for assessing the performance metrics of local LLMs in serious games |
large language model |
|
|
| 11 |
HalluShift: Measuring Distribution Shifts towards Hallucination Detection in LLMs |
HalluShift: Detects hallucinations by measuring distribution shifts in LLMs |
large language model |
✅ |
|
| 12 |
Span-level Emotion-Cause-Category Triplet Extraction with Instruction Tuning LLMs and Data Augmentation |
Proposes a span-level emotion-cause-category triplet extraction method based on instruction-tuned LLMs and data augmentation |
large language model |
✅ |
|
| 13 |
AgentA/B: Automated and Scalable Web A/BTesting with Interactive LLM Agents |
AgentA/B: Uses interactive LLM agents for automated and scalable web A/B testing |
large language model |
|
|
| 14 |
Leveraging Reasoning Model Answers to Enhance Non-Reasoning Model Capability |
Uses reasoning model answers to enhance non-reasoning model capability |
large language model |
|
|
| 15 |
AdaSteer: Your Aligned LLM is Inherently an Adaptive Jailbreak Defender |
AdaSteer: Proposes an adaptive activation steering method to strengthen LLM jailbreak defense. |
large language model |
|
|
| 16 |
UXAgent: A System for Simulating Usability Testing of Web Design with LLM Agents |
UXAgent: Uses LLM agents to simulate usability testing of web designs, supporting UX research. |
large language model |
|
|
| 17 |
Measuring LLM Novelty As The Frontier Of Original And High-Quality Output |
Proposes a novelty metric that balances originality and quality to evaluate the creative output of LLMs. |
large language model |
|
|
| 18 |
On Language Models' Sensitivity to Suspicious Coincidences |
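Finds that language models are insensitive to suspicious coincidences in zero-shot settings, though prompt engineering can increase their sensitivity |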
chain-of-thought |
|
|