| 1 |
A Multifaceted Analysis of Negative Bias in Large Language Models through the Lens of Parametric Knowledge |
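Analyzes negative bias in large language models through the lens of parametric knowledge, revealing the underlying factors that influence it. |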
large language model chain-of-thought |
|
|
| 2 |
Multimodal Peer Review Simulation with Actionable To-Do Recommendations for Community-Aware Manuscript Revisions |
Proposes a multimodal peer review simulation system to improve the quality of manuscript revisions. |
large language model multimodal |
|
|
| 3 |
Additive Large Language Models for Semi-Structured Text |
Proposes the CALM framework to address the interpretability of LLMs in semi-structured clinical text classification. |
large language model |
|
|
| 4 |
Automata-Based Steering of Large Language Models for Diverse Structured Generation |
Proposes an automata-based method for steering LLMs that improves diversity in structured generation tasks. |
large language model |
|
|
| 5 |
Scaling Open-Weight Large Language Models for Hydropower Regulatory Information Extraction: A Systematic Analysis |
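Systematically analyzes the performance and resource consumption of open-weight large language models for hydropower regulatory information extraction. |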
large language model |
|
|
| 6 |
Prompt-Based Value Steering of Large Language Models |
Proposes a prompt-based method for steering the values of large language models without model fine-tuning. |
large language model |
|
|
| 7 |
Evaluating Large Language Models on Rare Disease Diagnosis: A Case Study using House M.D |
Evaluates the ability of large language models to diagnose rare diseases using a House M.D. dataset. |
large language model |
|
|
| 8 |
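Random Text, Zipf's Law, Critical Length, and Implications for Large Language Models |
Proposes an explanation of Zipf's law based on a random-text model, providing a null model for the statistical properties of language models. |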
large language model |
|
|
| 9 |
LaoBench: A Large-Scale Multidimensional Lao Benchmark for Large Language Models |
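LaoBench: the first large-scale, multidimensional Lao benchmark for evaluating the understanding and reasoning abilities of large language models. |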
large language model |
|
|
| 10 |
From Proof to Program: Characterizing Tool-Induced Reasoning Hallucinations in Large Language Models |
Reveals the problem of tool-induced reasoning myopia in tool-augmented language models. |
large language model |
✅ |
|
| 11 |
Enhancing Meme Emotion Understanding with Multi-Level Modality Enhancement and Dual-Stage Modal Fusion |
Proposes MemoDetector, which improves meme emotion understanding through multi-level modality enhancement and dual-stage fusion. |
large language model multimodal |
✅ |
|
| 12 |
Expert-Guided Prompting and Retrieval-Augmented Generation for Emergency Medical Service Question Answering |
Proposes Expert-CoT and ExpertRAG to address insufficient domain knowledge in emergency medical question answering. |
large language model chain-of-thought |
|
|
| 13 |
Can LLMs Detect Their Own Hallucinations? |
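Proposes a CoT-based framework to evaluate LLMs' ability to self-detect hallucinations and improve hallucination detection accuracy. |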
large language model chain-of-thought |
|
|
| 14 |
AV-Dialog: Spoken Dialogue Models with Audio-Visual Input |
AV-Dialog: proposes a multimodal dialogue framework that uses audio-visual input to improve dialogue quality in noisy environments. |
multimodal |
|
|
| 15 |
Improving LLM's Attachment to External Knowledge In Dialogue Generation Tasks Through Entity Anonymization |
Proposes an entity anonymization method to increase LLMs' use of external knowledge in dialogue generation tasks. |
large language model |
|
|
| 16 |
InData: Towards Secure Multi-Step, Tool-Based Data Analysis |
InData: a dataset and benchmark for secure, multi-step, tool-based data analysis. |
large language model |
|
|
| 17 |
PRSM: A Measure to Evaluate CLIP's Robustness Against Paraphrases |
Proposes the PRSM metric to evaluate the robustness of CLIP models under paraphrasing, revealing potential biases. |
multimodal |
|
|
| 18 |
Towards Autoformalization of LLM-generated Outputs for Requirement Verification |
Uses LLMs to autoformalize LLM-generated content for requirement verification. |
large language model |
|
|
| 19 |
M-DAIGT: A Shared Task on Multi-Domain Detection of AI-Generated Text |
M-DAIGT: a shared task and large-scale benchmark dataset for multi-domain detection of AI-generated text. |
large language model |
|
|
| 20 |
Structured Definitions and Segmentations for Legal Reasoning in LLMs: A Study on Indian Legal Data |
Improves LLM performance on Indian legal reasoning tasks through structured definitions and segmentation. |
large language model |
|
|
| 21 |
KGQuest: Template-Driven QA Generation from Knowledge Graphs with LLM-Based Refinement |
KGQuest: proposes a template-driven QA generation method based on knowledge graphs with LLM-based refinement. |
large language model |
|
|
| 22 |
Automated Analysis of Learning Outcomes and Exam Questions Based on Bloom's Taxonomy |
Studies automated analysis of exam questions and learning outcomes based on Bloom's Taxonomy. |
large language model |
|
|
| 23 |
Community-Aligned Behavior Under Uncertainty: Evidence of Epistemic Stance Transfer in LLMs |
Proposes a framework for evaluating whether LLMs exhibit behavior patterns consistent with specific communities under uncertainty. |
large language model |
|
|
| 24 |
Analysing Personal Attacks in U.S. Presidential Debates |
Proposes a Transformer-based framework for analyzing personal attacks in U.S. presidential debates. |
large language model |
|
|
| 25 |
PIRA: Preference-Oriented Instruction-Tuned Reward Models with Dual Aggregation |
PIRA: proposes a preference-oriented, instruction-tuned reward model with dual aggregation. |
large language model |
|
|