cs.LG (2025-03-19)
📊 26 papers in total | 🔗 2 with code
🎯 Interest Area Navigation
Pillar 2: RL Algorithms & Architecture (12 🔗1)
Pillar 9: Embodied Foundation Models (12)
Pillar 1: Robot Control (2 🔗1)
🔬 Pillar 2: RL Algorithms & Architecture (12 papers)
| # | Title | Key Point | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
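| 1 | Continual Multimodal Contrastive Learning | Proposes an optimization-based continual multimodal contrastive learning method that addresses catastrophic forgetting when modality data is learned incrementally. | contrastive learning, multimodal | ✅ | |
| 2 | Towards Achieving Perfect Multimodal Alignment | Proposes a method for perfect multimodal alignment that improves cross-modal representation learning and transfer performance. | representation learning, multimodal | | |
| 3 | Learning Topology Actions for Power Grid Control: A Graph-Based Soft-Label Imitation Learning Approach | Proposes a power grid topology control method based on graph neural networks and soft-label imitation learning. | reinforcement learning, deep reinforcement learning, DRL | | |
| 4 | Application of linear regression and quasi-Newton methods to the deep reinforcement learning in continuous action cases | Proposes DLS-DDPG, which combines linear regression with quasi-Newton methods to improve deep reinforcement learning in continuous action spaces. | reinforcement learning, deep reinforcement learning | | |
| 5 | VIPER: Visual Perception and Explainable Reasoning for Sequential Decision-Making | VIPER: a visual perception and explainable reasoning framework for sequential decision-making. | reinforcement learning, large language model, multimodal | | |
| 6 | Continual Contrastive Learning on Tabular Data with Out of Distribution | Proposes TCCL, a continual contrastive learning framework for tabular data that improves OOD generalization. | representation learning, contrastive learning | | |
| 7 | Good Actions Succeed, Bad Actions Generalize: A Case Study on Why RL Generalizes Better | Compares how supervised learning and reinforcement learning generalize in visual navigation, and reveals the mechanism behind the better generalization of reinforcement learning. | reinforcement learning, PPO, behavior cloning | | |
| 8 | Robustness of Nonlinear Representation Learning | Studies the robustness of nonlinear representation learning and proposes an identifiability analysis under approximately isometric mixing. | representation learning | | |
| 9 | Partially Observable Reinforcement Learning with Memory Traces | Proposes a reinforcement learning method based on memory traces to handle long-term dependencies in partially observable environments. | reinforcement learning | | |
| 10 | LogLLaMA: Transformer-based log anomaly detection with LLaMA | LogLLaMA: log anomaly detection with LLaMA, significantly outperforming existing methods. | reinforcement learning, large language model | | |
| 11 | What Makes a Reward Model a Good Teacher? An Optimization Perspective | Reveals what makes a reward model effective: the importance of variance from an optimization perspective. | reinforcement learning, RLHF | | |
| 12 | Multi-Agent Actor-Critic with Harmonic Annealing Pruning for Dynamic Spectrum Access Systems | Proposes a multi-agent actor-critic algorithm with harmonic annealing pruning for dynamic spectrum access systems. | reinforcement learning, deep reinforcement learning | | |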
🔬 Pillar 9: Embodied Foundation Models (12 papers)
🔬 Pillar 1: Robot Control (2 papers)
| # | Title | Key Point | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 25 | Diffusion-Based Forecasting for Uncertainty-Aware Model Predictive Control | Proposes a diffusion-model-based predictive control framework for uncertainty-aware decision-making. | MPC, model predictive control, reinforcement learning | | |
| 26 | 1000 Layer Networks for Self-Supervised RL: Scaling Depth Can Enable New Goal-Reaching Capabilities | Scales network depth to 1000 layers to improve self-supervised reinforcement learning on goal-reaching tasks. | locomotion, manipulation, reinforcement learning | ✅ | |