cs.CL (2024-10-31)

📊 33 papers in total | 🔗 10 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (27 🔗9) · Pillar 2: RL & Architecture (5 🔗1) · Pillar 4: Generative Motion (1)

🔬 Pillar 9: Embodied Foundation Models (27 papers)

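| # | Title | One-sentence summary | Tags | 🔗 |
|---|-------|----------------------|------|----|
| 1 | Constraint Back-translation Improves Complex Instruction Following of Large Language Models | Proposes constraint back-translation to improve LLMs' ability to follow complex instructions | large language model, instruction following | |
| 2 | Can Language Models Perform Robust Reasoning in Chain-of-thought Prompting with Noisy Rationales? | Proposes CD-CoT to make LLMs more robust to chain-of-thought prompts with noisy rationales | large language model, chain-of-thought | |
| 3 | Blind Spot Navigation in Large Language Model Reasoning with Thought Space Explorer | Proposes Thought Space Explorer to address blind spots in LLM reasoning | large language model, chain-of-thought | |
| 4 | Audio Is the Achilles' Heel: Red Teaming Audio Large Multimodal Models | Proposes multi-dimensional red teaming of audio large multimodal models and exposes their safety vulnerabilities | large language model, multimodal | |
| 5 | Large Language Models for Patient Comments Multi-Label Classification | Uses LLMs for multi-label classification of patient comments to make healthcare feedback analysis more efficient | large language model, chain-of-thought | |
| 6 | BitStack: Any-Size Compression of Large Language Models in Variable Memory Environments | BitStack: a training-free method for any-size compression of LLMs in variable memory environments | large language model | |
| 7 | Scaling Up Membership Inference: When and How Attacks Succeed on Large Language Models | Proposes a membership inference attack that aggregates evidence across multiple documents and succeeds against LLMs | large language model | |
| 8 | 'No' Matters: Out-of-Distribution Detection in Multimodality Long Dialogue | Proposes the DIAEF framework to detect out-of-distribution data in multimodal long dialogue, improving user experience | multimodal | |
| 9 | JudgeRank: Leveraging Large Language Models for Reasoning-Intensive Reranking | Proposes JudgeRank, which uses LLMs for reasoning-intensive reranking to improve retrieval-augmented generation | large language model | |
| 10 | IdeaBench: Benchmarking Large Language Models for Research Idea Generation | IdeaBench: a benchmark framework for evaluating how well LLMs generate research ideas | large language model | |
| 11 | LEAF: Learning and Evaluation Augmented by Fact-Checking to Improve Factualness in Large Language Models | LEAF: improves LLM factuality through fact-checking-augmented learning and evaluation | large language model | |
| 12 | What Happened in LLMs Layers when Trained for Fast vs. Slow Thinking: A Gradient Perspective | Uses gradient analysis to reveal layer-wise differences in LLMs trained for fast versus slow thinking | large language model, chain-of-thought | |
| 13 | What is Wrong with Perplexity for Long-context Language Modeling? | Proposes the LongPPL metric and the LongCE loss to address the failure of perplexity in long-context modeling | large language model | |
| 14 | Rethinking Scale: The Efficacy of Fine-Tuned Open-Source LLMs in Large-Scale Reproducible Social Science Research | Fine-tuned open-source LLMs improve the efficiency and transparency of large-scale reproducible social science research | large language model | |
| 15 | Schema Augmentation for Zero-Shot Domain Adaptation in Dialogue State Tracking | Proposes Schema Augmentation to improve cross-domain generalization in zero-shot dialogue state tracking | large language model | |
| 16 | RSL-SQL: Robust Schema Linking in Text-to-SQL Generation | Proposes the RSL-SQL framework, which improves Text-to-SQL generation through robust schema linking | large language model | |
| 17 | Exploring the Knowledge Mismatch Hypothesis: Hallucination Propensity in Small Models Fine-tuned on Data from Larger Models | Shows that fine-tuning small models on data generated by larger models tends to create a knowledge mismatch that worsens hallucination | large language model | |
| 18 | Commonsense Knowledge Editing Based on Free-Text in LLMs | Proposes DEM, a method for editing free-text commonsense knowledge in LLMs | large language model | |
| 19 | DetectRL: Benchmarking LLM-Generated Text Detection in Real-World Scenarios | DetectRL: a benchmark for detecting LLM-generated text in real-world scenarios | large language model | |
| 20 | From Context to Action: Analysis of the Impact of State Representation and Context on the Generalization of Multi-Turn Web Navigation Agents | Improves the generalization of LLM-driven multi-turn web navigation agents through better context management | large language model | |
| 21 | RESTOR: Knowledge Recovery in Machine Unlearning | RESTOR: a framework for evaluating knowledge recovery in machine unlearning | large language model | |
| 22 | Desert Camels and Oil Sheikhs: Arab-Centric Red Teaming of Frontier LLMs | Red-teams and safety-evaluates frontier LLMs for bias against Arab culture | large language model | |
| 23 | Multilingual Pretraining Using a Large Corpus Machine-Translated from a Single Source Language | Pretrains multilingual LLMs on high-quality data machine-translated from a single source language, significantly improving non-English reasoning | large language model | |
| 24 | Language Models can Self-Lengthen to Generate Long Texts | Proposes the Self-Lengthen framework, which uses an LLM's own capabilities to generate longer texts without extra data or proprietary models | large language model | |
| 25 | Instruction-Tuning Llama-3-8B Excels in City-Scale Mobility Prediction | Instruction-tunes Llama-3-8B for city-scale long-term mobility prediction, outperforming the state of the art | large language model | |
| 26 | Pseudo-Conversation Injection for LLM Goal Hijacking | Proposes pseudo-conversation injection, an attack that achieves goal hijacking of LLMs | large language model | |
| 27 | On Positional Bias of Faithfulness for Long-form Summarization | Proposes an evaluation benchmark and mitigation strategies for positional bias in long-form summarization | large language model | |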

🔬 Pillar 2: RL & Architecture (5 papers)

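| # | Title | One-sentence summary | Tags | 🔗 |
|---|-------|----------------------|------|----|
| 28 | LLM4Mat-Bench: Benchmarking Large Language Models for Materials Property Prediction | LLM4Mat-Bench: a benchmark of LLMs for materials property prediction | predictive model, large language model | |
| 29 | Scalable Reinforcement Post-Training Beyond Static Human Prompts: Evolving Alignment via Asymmetric Self-Play | Proposes eva, which evolves alignment via asymmetric self-play and makes LLM post-training scalable without additional human prompts | reinforcement learning, DPO, large language model | |
| 30 | Teaching Embodied Reinforcement Learning Agents: Informativeness and Diversity of Language Use | Studies how the informativeness and diversity of language affect embodied reinforcement learning agents, improving generalization and adaptability | reinforcement learning | |
| 31 | Simulating User Agents for Embodied Conversational-AI | Proposes an LLM-based user agent that simulates interactions with embodied conversational AI, lowering dataset construction costs | reinforcement learning, large language model | |
| 32 | SelfCodeAlign: Self-Alignment for Code Generation | SelfCodeAlign: a self-alignment framework for code generation that needs no human annotation or knowledge distillation | distillation, large language model | |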

🔬 Pillar 4: Generative Motion (1 paper)

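| # | Title | One-sentence summary | Tags | 🔗 |
|---|-------|----------------------|------|----|
| 33 | A Demonstration of Adaptive Collaboration of Large Language Models for Medical Decision-Making | MDAgents: an LLM-based adaptive collaboration framework for improving the accuracy and efficiency of medical decision-making | MDM, large language model | |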
