cs.LG (2025-02-03)

📊 11 papers in total | 🔗 1 with code

🎯 Interest Area Navigation

Pillar 2: RL & Architecture (5) · Pillar 9: Embodied Foundation Models (4, 🔗 1) · Pillar 1: Robot Control (1) · Pillar 8: Physics-based Animation (1)

🔬 Pillar 2: RL & Architecture (5 papers)

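| # | Title | One-line summary | Tags | 🔗 |
|---|-------|------------------|------|----|
| 1 | On Almost Surely Safe Alignment of Large Language Models at Inference-Time | Proposes InferenceGuard, which achieves near-complete safety alignment of large language models at inference time. | RLHF, large language model | |
| 2 | GNN-DT: Graph Neural Network Enhanced Decision Transformer for Efficient Optimization in Dynamic Environments | Proposes GNN-DT, which augments the Decision Transformer with graph neural networks to efficiently solve optimization problems in dynamic environments. | reinforcement learning, offline RL, decision transformer | |
| 3 | Competitive Programming with Large Reasoning Models | Uses reinforcement learning to improve the reasoning ability of large language models in competitive programming. | reinforcement learning, large language model | |
| 4 | Process Reinforcement through Implicit Rewards | PRIME: reinforces the process-level reasoning of language models through implicit rewards, without training an explicit process reward model. | reinforcement learning, large language model | |
| 5 | Eliciting Language Model Behaviors with Investigator Agents | Proposes investigator agents that search for prompts eliciting specific behaviors from language models. | reinforcement learning, DPO | |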

🔬 Pillar 9: Embodied Foundation Models (4 papers)

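| # | Title | One-line summary | Tags | 🔗 |
|---|-------|------------------|------|----|
| 6 | QLESS: A Quantized Approach for Data Valuation and Selection in Large Language Model Fine-Tuning | QLESS: a quantized approach to data valuation and selection for large language model fine-tuning. | large language model | |
| 7 | Multimodal Inverse Attention Network with Intrinsic Discriminant Feature Exploitation for Fake News Detection | Proposes the Multimodal Inverse Attention Network (MIAN), which exploits intrinsic discriminative features to improve fake news detection. | multimodal | |
| 8 | Logits are All We Need to Adapt Closed Models | Proposes the Plugin model, which adapts closed-source LLMs to specific tasks through logit reweighting alone. | large language model | |
| 9 | Preference Leakage: A Contamination Problem in LLM-as-a-judge | Reveals the preference leakage problem in LLM-as-a-judge, which arises from relatedness between the generator and the evaluator. | large language model | |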

🔬 Pillar 1: Robot Control (1 paper)

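| # | Title | One-line summary | Tags | 🔗 |
|---|-------|------------------|------|----|
| 10 | Toward Task Generalization via Memory Augmentation in Meta-Reinforcement Learning | Proposes a memory-augmented meta-reinforcement learning method that improves task generalization. | legged locomotion, locomotion, reinforcement learning | |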

🔬 Pillar 8: Physics-based Animation (1 paper)

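| # | Title | One-line summary | Tags | 🔗 |
|---|-------|------------------|------|----|
| 11 | Learning Traffic Anomalies from Generative Models on Real-Time Observations | Proposes a traffic anomaly detection method based on a spatiotemporal generative adversarial network, for real-time traffic management. | spatiotemporal | |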
