| 1 |
SemioLLM: Evaluating Large Language Models for Diagnostic Reasoning from Unstructured Clinical Narratives in Epilepsy |
SemioLLM: evaluates the ability of large language models to reason from unstructured clinical narratives for epilepsy diagnosis. |
large language model chain-of-thought |
|
|
| 2 |
LANE: Logic Alignment of Non-tuning Large Language Models and Online Recommendation Systems for Explainable Reason Generation |
LANE: aligns large language models with online recommendation systems, without tuning, to generate explainable reasons. |
large language model chain-of-thought |
|
|
| 3 |
SOS! Soft Prompt Attack Against Open-Source Large Language Models |
Proposes SOS, a soft prompt attack against open-source large language models that poses a low-cost, non-invasive security threat. |
large language model |
|
|
| 4 |
Knowledge-based Consistency Testing of Large Language Models |
KonTest: a knowledge-graph-based consistency testing framework for large language models. |
large language model |
|
|
| 5 |
Regurgitative Training: The Value of Real Data in Training Large Language Models |
Shows that retraining on LLM-generated data significantly degrades LLM performance. |
large language model |
|
|
| 6 |
Raw Text is All you Need: Knowledge-intensive Multi-turn Instruction Tuning for Large Language Model |
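Proposes the R2S framework, which generates knowledge-intensive multi-turn dialogues from raw text to improve instruction tuning of large language models. |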
large language model |
|
|
| 7 |
Are Large Language Models Consistent over Value-laden Questions? |
Evaluates the consistency of large language models on value-laden questions, revealing model bias and stability. |
large language model |
|
|
| 8 |
Large Language Models as Evaluators for Scientific Synthesis |
Explores the use and limitations of large language models for evaluating the quality of scientific syntheses. |
large language model |
|
|
| 9 |
Emotion and Intent Joint Understanding in Multimodal Conversation: A Benchmarking Dataset |
Proposes the MC-EIU dataset for joint understanding of emotion and intent in multimodal conversation. |
multimodal |
✅ |
|
| 10 |
Social Bias in Large Language Models For Bangla: An Empirical Study on Gender and Religious Bias |
Proposes a social bias evaluation method and dataset for large language models in Bangla. |
large language model |
|
|
| 11 |
Investigating Decoder-only Large Language Models for Speech-to-text Translation |
Proposes a speech-to-text translation model based on a decoder-only LLM that reaches SOTA without proprietary data. |
large language model |
|
|
| 12 |
Enhancing Translation Accuracy of Large Language Models through Continual Pre-Training on Parallel Data |
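Proposes a method for improving the translation accuracy of large language models through continual pre-training on parallel data. |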
large language model |
|
|
| 13 |
Social Bias Evaluation for Large Language Models Requires Prompt Variations |
Social bias evaluation of large language models needs to account for prompt variation. |
large language model |
|
|
| 14 |
CogErgLLM: Exploring Large Language Model Systems Design Perspective Using Cognitive Ergonomics |
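Proposes integrating cognitive ergonomics into LLM system design to improve the safety of human-computer interaction and user satisfaction. |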
large language model |
|
|
| 15 |
ESQA: Event Sequences Question Answering |
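ESQA: targets event sequence question answering, using LLMs effectively and addressing the difficulty of handling long sequences and temporal and numerical features. |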
large language model TAMP |
|
|
| 16 |
Fine-Tuning on Diverse Reasoning Chains Drives Within-Inference CoT Refinement in LLMs |
Fine-tuning on diverse reasoning chains improves an LLM's ability to refine its CoT within a single inference. |
large language model chain-of-thought |
✅ |
|
| 17 |
FSM: A Finite State Machine Based Zero-Shot Prompting Paradigm for Multi-Hop Question Answering |
Proposes FSM, a finite-state-machine-based zero-shot prompting method that improves LLM reasoning on multi-hop question answering. |
large language model chain-of-thought |
|
|
| 18 |
VIVA: A Benchmark for Vision-Grounded Decision-Making with Human Values |
VIVA: a benchmark for decision-making grounded in vision and human values. |
multimodal |
|
|
| 19 |
How Does Quantization Affect Multilingual LLMs? |
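Shows how quantization affects multilingual LLMs: performance drops markedly for non-Latin-script languages, and human evaluation is more sensitive to the degradation. |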
large language model |
|
|
| 20 |
Let the Code LLM Edit Itself When You Edit the Code |
Proposes PIE, a method that efficiently updates the LLM's KV cache in code-editing scenarios, addressing the cost of re-encoding. |
large language model |
|
|
| 21 |
JailbreakHunter: A Visual Analytics Approach for Jailbreak Prompts Discovery from Large-Scale Human-LLM Conversational Datasets |
JailbreakHunter: a visual-analytics-based method for discovering LLM jailbreak prompts at scale. |
large language model |
|
|
| 22 |
Improving LLM Abilities in Idiomatic Translation |
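Proposes a knowledge-base-augmented LLM translation method that improves the faithfulness, expressiveness, and elegance of idiom translation. |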
large language model |
|
|
| 23 |
Collaborative Quest Completion with LLM-driven Non-Player Characters in Minecraft |
Uses LLM-driven Minecraft NPCs to complete collaborative quests. |
large language model |
|
|
| 24 |
Truth is Universal: Robust Detection of Lies in LLMs |
Proposes an LLM lie detection method based on a subspace of activation vectors, improving robustness and generalization. |
large language model |
|
|
| 25 |
What Affects the Stability of Tool Learning? An Empirical Study on the Robustness of Tool Learning Frameworks |
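Studies the factors that affect the stability of tool learning frameworks, with the aim of improving LLM robustness in real-world applications. |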
large language model |
|
|
| 26 |
Mast Kalandar at SemEval-2024 Task 8: On the Trail of Textual Origins: RoBERTa-BiLSTM Approach to Detect AI-Generated Text |
Proposes a RoBERTa-BiLSTM classifier for detecting AI-generated text across multiple domains, addressing the risk of misuse. |
large language model |
|
|
| 27 |
52B to 1T: Lessons Learned via Tele-FLM Series |
Tele-FLM series: lessons and practice from scaling LLMs from 52B to 1T parameters. |
large language model |
|
|
| 28 |
LLM Internal States Reveal Hallucination Risk Faced With a Query |
Assesses an LLM's hallucination risk when faced with a query by analyzing its internal states. |
large language model |
|
|
| 29 |
Cactus: Towards Psychological Counseling Conversations using Cognitive Behavioral Theory |
Cactus: builds a psychological counseling conversation dataset based on cognitive behavioral therapy. |
large language model |
|
|
| 30 |
ALTER: Augmentation for Large-Table-Based Reasoning |
The ALTER framework improves LLM performance on large-table reasoning by augmenting both queries and table data. |
large language model |
|
|
| 31 |
From Theft to Bomb-Making: The Ripple Effect of Unlearning in Defending Against Jailbreak Attacks |
Reveals the ripple effect of unlearning in LLMs, improving defense against jailbreak attacks. |
large language model |
✅ |
|
| 32 |
Efficient Training of Language Models with Compact and Consistent Next Token Distributions |
Proposes compact and consistent next-token distributions that speed up and improve language model training. |
large language model |
|
|
| 33 |
MentalAgora: A Gateway to Advanced Personalized Care in Mental Health through Multi-Agent Debating and Attribute Control |
MentalAgora: a personalized mental health support system based on multi-agent debate and attribute control. |
large language model |
|
|