cs.AI (2025-02-03)

📊 9 papers in total | 🔗 3 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (7 papers, 🔗 2) · Pillar 2: RL & Architecture (2 papers, 🔗 1)

🔬 Pillar 9: Embodied Foundation Models (7 papers)

| # | Title | One-line Takeaway | Tags | 🔗 |
| --- | --- | --- | --- | --- |
| 1 | PSSD: Making Large Language Models Self-denial via Human Psyche Structure | PSSD: enables self-denial in LLMs via a human psyche structure, improving reasoning accuracy | large language model | |
| 2 | From Divergence to Consensus: Evaluating the Role of Large Language Models in Facilitating Agreement through Adaptive Strategies | Proposes an LLM-based adaptive negotiation framework that helps groups reach consensus in decision-making | large language model | |
| 3 | Skewed Memorization in Large Language Models: Quantification and Decomposition | Quantifies and decomposes skewed memorization in LLMs, revealing its relationship to the training data | large language model | |
| 4 | DeepRAG: Thinking to Retrieve Step by Step for Large Language Models | DeepRAG: a retrieval-augmented generation framework based on a Markov decision process that improves LLM reasoning | large language model | |
| 5 | VideoRAG: Retrieval-Augmented Generation with Extreme Long-Context Videos | Proposes VideoRAG, a retrieval-augmented generation framework for processing and understanding extremely long-context videos | large language model | |
| 6 | Learning to Generate Unit Tests for Automated Debugging | UTGen: learns to generate unit tests to support automated LLM debugging | large language model | |
| 7 | Picky LLMs and Unreliable RMs: An Empirical Study on Safety Alignment after Instruction Tuning | Reveals safety degradation in LLMs after instruction tuning and analyzes the limitations of reward models for safety alignment | large language model | |

🔬 Pillar 2: RL & Architecture (2 papers)

| # | Title | One-line Takeaway | Tags | 🔗 |
| --- | --- | --- | --- | --- |
| 8 | TeLL-Drive: Enhancing Autonomous Driving with Teacher LLM-Guided Deep Reinforcement Learning | TeLL-Drive: enhances autonomous driving with teacher-LLM-guided deep reinforcement learning | reinforcement learning, deep reinforcement learning, DRL | |
| 9 | Fine-Tuning Discrete Diffusion Models with Policy Gradient Methods | Proposes SEPO, a policy-gradient algorithm for fine-tuning discrete diffusion models that addresses the reward-optimization problem | reinforcement learning, RLHF | |
