cs.CL (2024-05-26)
📊 16 papers total | 🔗 5 with code
🎯 Topic Navigation
Pillar 9: Embodied Foundation Models (10, 🔗 4)
Pillar 2: RL & Architecture (5, 🔗 1)
Pillar 1: Robot Control (1)
🔬 Pillar 9: Embodied Foundation Models (10 papers)
🔬 Pillar 2: RL & Architecture (5 papers)
| # | Title | One-line Summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 11 | Multi-Reference Preference Optimization for Large Language Models | Proposes Multi-Reference Preference Optimization (MRPO) to improve LLM alignment with human intent | reinforcement learning, preference learning, DPO |  |  |
| 12 | Triple Preference Optimization: Achieving Better Alignment using a Single Step Optimization | Proposes Triple Preference Optimization (TPO), improving LLM reasoning and instruction following via a single optimization step | reinforcement learning, preference learning, RLHF |  |  |
| 13 | M-RAG: Reinforcing Large Language Model Performance through Retrieval-Augmented Generation with Multiple Partitions | Proposes M-RAG, a multi-partition retrieval-augmented generation framework that improves LLM performance across tasks | reinforcement learning, large language model |  |  |
| 14 | RLSF: Fine-tuning LLMs via Symbolic Feedback | RLSF fine-tunes LLMs with symbolic feedback to improve domain reasoning and logical alignment | reinforcement learning, large language model |  |  |
| 15 | Automatically Generating Numerous Context-Driven SFT Data for LLMs across Diverse Granularity | Proposes AugCon, which automatically generates context-driven SFT data at multiple granularities to improve LLM fine-tuning | contrastive learning, large language model | ✅ |  |
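The preference-optimization papers above (MRPO, TPO) build on DPO-style objectives. A minimal sketch of the standard DPO loss for one preference pair, with hypothetical scalar log-probability inputs (not code from any of the listed papers):

```python
import math

def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Standard DPO loss for a single (chosen, rejected) pair.

    Inputs are sequence log-probabilities under the policy being trained
    and under a frozen reference model; beta scales how strongly the
    policy is pushed away from the reference.
    """
    # Implicit reward margin: how much more the policy prefers the chosen
    # response than the rejected one, relative to the reference model.
    margin = (logp_chosen - ref_logp_chosen) - (logp_rejected - ref_logp_rejected)
    # -log(sigmoid(beta * margin)): small when the margin is large and positive.
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))
```

With a zero margin the loss is exactly log 2; raising the chosen response's policy log-probability lowers the loss, which is the gradient signal that aligns the model with the preference data.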
🔬 Pillar 1: Robot Control (1 paper)
| # | Title | One-line Summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 16 | MentalManip: A Dataset For Fine-grained Analysis of Mental Manipulation in Conversations | Proposes the MentalManip dataset for fine-grained analysis of mental manipulation in conversations | manipulation |  |  |