cs.LG (2023-12-27)

📊 10 papers | 🔗 3 with code

🎯 Interest Area Navigation

Pillar 2: RL Algorithms & Architecture (RL & Architecture) (9, 🔗 3) · Pillar 9: Embodied Foundation Models (1)

🔬 Pillar 2: RL Algorithms & Architecture (RL & Architecture) (9 papers)

| # | Title | One-line takeaway | Tags | 🔗 |
|---|-------|-------------------|------|----|
| 1 | Adaptive trajectory-constrained exploration strategy for deep reinforcement learning | Proposes an adaptive trajectory-constrained exploration strategy to address the exploration problem in deep RL | reinforcement learning · deep reinforcement learning · DRL | |
| 2 | Model Selection for Inverse Reinforcement Learning via Structural Risk Minimization | Proposes a model selection method for inverse RL based on structural risk minimization | reinforcement learning · inverse reinforcement learning | |
| 3 | Preference as Reward, Maximum Preference Optimization with Importance Sampling | Proposes Maximum Preference Optimization (MPO) with importance sampling, improving language model alignment with human values | reinforcement learning · PPO · preference learning | |
| 4 | Soft Contrastive Learning for Time Series | Proposes SoftCLT, improving time series representation quality via soft contrastive learning | contrastive learning · TAMP | |
| 5 | MIM4DD: Mutual Information Maximization for Dataset Distillation | MIM4DD: dataset distillation via mutual information maximization, improving information retention | contrastive learning · distillation | |
| 6 | Dynamic Sub-graph Distillation for Robust Semi-supervised Continual Learning | Proposes dynamic sub-graph distillation (DSGD) to address catastrophic forgetting in semi-supervised continual learning | distillation | |
| 7 | Foundations of Reinforcement Learning and Interactive Decision Making | Builds a statistical theoretical framework for RL and interactive decision making, focusing on function approximation and high-dimensional feedback | reinforcement learning | |
| 8 | Active Third-Person Imitation Learning | Proposes an active third-person imitation learning framework to address the viewpoint-selection problem | imitation learning | |
| 9 | Learning to Embed Time Series Patches Independently | Proposes embedding time series patches independently, improving time series forecasting and classification performance | representation learning · contrastive learning | |

🔬 Pillar 9: Embodied Foundation Models (1 paper)

| # | Title | One-line takeaway | Tags | 🔗 |
|---|-------|-------------------|------|----|
| 10 | How Robust are LLMs to In-Context Majority Label Bias? | Studies the robustness of LLMs to majority-label bias in in-context learning | large language model | |
