cs.CL (2024-05-31)
📊 25 papers in total | 🔗 5 with code
🎯 Interest Area Navigation
Pillar 9: Embodied Foundation Models (16 🔗2)
Pillar 2: RL Algorithms & Architecture (8 🔗3)
Pillar 1: Robot Control (1)
🔬 Pillar 9: Embodied Foundation Models (16 papers)
🔬 Pillar 2: RL Algorithms & Architecture (8 papers)
| # | Title | One-line Summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 17 | LLM-ESR: Large Language Models Enhancement for Long-tailed Sequential Recommendation | Proposes the LLM-ESR framework, which uses large language models to improve long-tailed sequential recommendation | distillation, large language model | ✅ | |
| 18 | Learning to Clarify: Multi-turn Conversations with Action-Based Contrastive Self-Training | Proposes ACT, an action-based contrastive self-training method that improves an LLM's ability to clarify user intent in multi-turn conversations | policy learning, DPO, direct preference optimization | | |
| 19 | Direct Alignment of Language Models via Quality-Aware Self-Refinement | Proposes a quality-aware self-refinement method for direct language model alignment that improves DPO training (see the DPO sketch below the table) | reinforcement learning, RLHF, DPO | | |
| 20 | Improving Reward Models with Synthetic Critiques | Proposes training reward models on synthetic critiques, improving data efficiency and generalization | reinforcement learning, large language model, instruction following | | |
| 21 | Towards Spoken Language Understanding via Multi-level Multi-grained Contrastive Learning | Proposes MMCL, a multi-level multi-grained contrastive learning framework that improves spoken language understanding | contrastive learning, distillation | | |
| 22 | SaySelf: Teaching LLMs to Express Confidence with Self-Reflective Rationales | Proposes the SaySelf framework, which teaches LLMs to express fine-grained confidence and generate self-reflective rationales | reinforcement learning, large language model | ✅ | |
| 23 | Learning to Estimate System Specifications in Linear Temporal Logic using Transformers and Mamba | Proposes autoregressive models based on Transformers and Mamba for mining system specifications expressed as linear temporal logic formulas | Mamba | | |
| 24 | Self-Augmented Preference Optimization: Off-Policy Paradigms for Language Model Alignment | Proposes Self-Augmented Preference Optimization (SAPO), which aligns language models without paired preference data | policy learning, DPO, direct preference optimization | ✅ | |
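Several entries above (#18, #19, #24) build on Direct Preference Optimization (DPO). For reference, here is a minimal sketch of the standard DPO loss (Rafailov et al., 2023) that these methods extend, not any single paper's variant; the tensor names are illustrative placeholders, and each argument is assumed to hold per-sequence summed token log-probabilities.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps: torch.Tensor,
             policy_rejected_logps: torch.Tensor,
             ref_chosen_logps: torch.Tensor,
             ref_rejected_logps: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    """Standard DPO objective: widen the policy/reference log-ratio margin
    between chosen and rejected responses, scaled by beta."""
    chosen_ratio = policy_chosen_logps - ref_chosen_logps
    rejected_ratio = policy_rejected_logps - ref_rejected_logps
    logits = beta * (chosen_ratio - rejected_ratio)
    # -log(sigmoid(x)) equals softplus(-x); average over the batch
    return F.softplus(-logits).mean()
```

This contrastive log-ratio objective is the shared starting point; the listed papers differ mainly in how the preference data is obtained, weighted, or refined.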
🔬 Pillar 1: Robot Control (1 paper)
| # | Title | One-line Summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 25 | UniBias: Unveiling and Mitigating LLM Bias through Internal Attention and FFN Manipulation | Proposes UniBias, which exposes and mitigates LLM bias by manipulating internal attention and FFN components | manipulation, large language model | | |