cs.AI (2024-10-17)

📊 15 papers | 🔗 1 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (8, 🔗1) · Pillar 2: RL & Architecture (5) · Pillar 7: Motion Retargeting (1) · Pillar 1: Robot Control (1)

🔬 Pillar 9: Embodied Foundation Models (8 papers)

| # | Title | One-line Summary | Tags | 🔗 |
|---|-------|------------------|------|----|
| 1 | Large Language Models as Narrative-Driven Recommenders | Uses large language models for narrative-driven movie recommendation, significantly outperforming traditional methods. | large language model | |
| 2 | Best in Tau@LLMJudge: Criteria-Based Relevance Evaluation with Llama3 | Proposes an LLMJudge evaluation method based on Llama3 and multi-dimensional criteria, improving the accuracy of information-retrieval system evaluation. | large language model | |
| 3 | ETF: An Entity Tracing Framework for Hallucination Detection in Code Summaries | Proposes ETF, an entity tracing framework for detecting hallucinations in code summaries. | large language model | |
| 4 | AgentOccam: A Simple Yet Strong Baseline for LLM-Based Web Agents | AgentOccam: significantly improves the performance of LLM-based web agents by refining the observation and action spaces. | large language model | |
| 5 | Optimal Quantization for Matrix Multiplication | Proposes an asymptotically optimal quantization scheme for matrix multiplication based on nested lattices. | large language model | |
| 6 | Rapid and Automated Alloy Design with Graph Neural Network-Powered LLM-Driven Multi-Agent Systems | Proposes a multi-agent system powered by graph neural networks and driven by LLMs to accelerate alloy design. | multimodal | |
| 7 | FIRE: Fact-checking with Iterative Retrieval and Verification | Proposes the FIRE framework for efficient fact-checking via iterative retrieval and verification. | large language model | |
| 8 | Chain of Ideas: Revolutionizing Research Via Novel Idea Development with LLM Agents | Proposes Chain-of-Ideas agents that use LLMs to transform research ideation, reducing cost while rivaling human-level quality. | large language model | |

🔬 Pillar 2: RL & Architecture (5 papers)

| # | Title | One-line Summary | Tags | 🔗 |
|---|-------|------------------|------|----|
| 9 | Holistic Utility Preference Learning for Listwise Alignment | Proposes DRPO, which addresses listwise preference learning in LLM alignment by optimizing ranking preferences. | reinforcement learning, preference learning, RLHF | |
| 10 | Anchored Alignment for Self-Explanations Enhancement | Proposes an anchored alignment method that improves the self-explanation ability of large language models without annotated labels. | DPO, direct preference optimization, large language model | |
| 11 | Goal Inference from Open-Ended Dialog | Proposes an online method that infers goals from dialog, improving embodied agents' ability to accomplish user goals. | RLHF, large language model | |
| 12 | Approximating Auction Equilibria with Reinforcement Learning | Proposes a reinforcement-learning method for approximating auction equilibria, addressing the computational challenges of complex auction settings. | reinforcement learning | |
| 13 | Transformer Guided Coevolution: Improved Team Selection in Multiagent Adversarial Team Games | Proposes BERTeam, which uses a Transformer to improve team selection in multi-agent adversarial team games. | reinforcement learning, deep reinforcement learning | |

🔬 Pillar 7: Motion Retargeting (1 paper)

| # | Title | One-line Summary | Tags | 🔗 |
|---|-------|------------------|------|----|
| 14 | MobA: Multifaceted Memory-Enhanced Adaptive Planning for Efficient Mobile Task Automation | MobA: multifaceted memory-enhanced adaptive planning for efficient mobile task automation. | spatial relationship, large language model, multimodal | |

🔬 Pillar 1: Robot Control (1 paper)

| # | Title | One-line Summary | Tags | 🔗 |
|---|-------|------------------|------|----|
| 15 | Persistent Pre-Training Poisoning of LLMs | Shows that poisoning attacks at the LLM pre-training stage persist: a poisoning rate of just 0.1% continues to affect models after fine-tuning. | manipulation, DPO, large language model | |
