cs.LG (2025-02-03)
📊 11 papers in total | 🔗 1 with code
🎯 Interest Area Navigation
Pillar 2: RL & Architecture (5)
Pillar 9: Embodied Foundation Models (4 🔗1)
Pillar 1: Robot Control (1)
Pillar 8: Physics-based Animation (1)
🔬 Pillar 2: RL & Architecture (5 papers)
| # | Title | One-line summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
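| 1 | On Almost Surely Safe Alignment of Large Language Models at Inference-Time | Proposes InferenceGuard, which achieves almost surely safe alignment of large language models at inference time. | RLHF, large language model | | |
| 2 | GNN-DT: Graph Neural Network Enhanced Decision Transformer for Efficient Optimization in Dynamic Environments | Proposes GNN-DT, which augments the Decision Transformer with graph neural networks to efficiently solve optimization problems in dynamic environments. | reinforcement learning, offline RL, decision transformer | | |
| 3 | Competitive Programming with Large Reasoning Models | Uses reinforcement learning to improve the reasoning ability of large language models in competitive programming. | reinforcement learning, large language model | | |
| 4 | Process Reinforcement through Implicit Rewards | PRIME: strengthens the process-level reasoning of language models through implicit rewards, without training an explicit process reward model. | reinforcement learning, large language model | | |
| 5 | Eliciting Language Model Behaviors with Investigator Agents | Proposes Investigator Agents, which search for prompts that elicit specific behaviors from language models. | reinforcement learning, DPO | | |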
🔬 Pillar 9: Embodied Foundation Models (4 papers)
| # | Title | One-line summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
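| 6 | QLESS: A Quantized Approach for Data Valuation and Selection in Large Language Model Fine-Tuning | QLESS: a quantized approach to data valuation and selection for large language model fine-tuning. | large language model | | |
| 7 | Multimodal Inverse Attention Network with Intrinsic Discriminant Feature Exploitation for Fake News Detection | Proposes the Multimodal Inverse Attention Network (MIAN), which exploits intrinsic discriminative features to improve fake news detection. | multimodal | | |
| 8 | Logits are All We Need to Adapt Closed Models | Proposes the Plugin model, which adapts closed-source LLMs to specific tasks through logit reweighting alone. | large language model | | |
| 9 | Preference Leakage: A Contamination Problem in LLM-as-a-judge | Reveals the preference leakage problem in LLM-as-a-judge, which stems from relatedness between the generator and the evaluator. | large language model | ✅ | |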
🔬 Pillar 1: Robot Control (1 paper)
| # | Title | One-line summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 10 | Toward Task Generalization via Memory Augmentation in Meta-Reinforcement Learning | Proposes a memory-augmented meta-reinforcement learning method that improves task generalization. | legged locomotion, locomotion, reinforcement learning | | |
🔬 Pillar 8: Physics-based Animation (1 paper)
| # | Title | One-line summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
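| 11 | Learning Traffic Anomalies from Generative Models on Real-Time Observations | Proposes a traffic anomaly detection method based on a spatiotemporal generative adversarial network, intended for real-time traffic management. | spatiotemporal | | |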