cs.LG (2024-07-05)

📊 18 papers in total | 🔗 2 with code

🎯 Navigation by Interest Area

Pillar 2: RL & Architecture (9, 🔗 2) · Pillar 9: Embodied Foundation Models (6) · Pillar 1: Robot Control (2) · Pillar 8: Physics-based Animation (1)

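🔬 Pillar 2: RL & Architecture (9 papers)

# | Title | One-sentence summary | Tags
1 | The Impact of Quantization and Pruning on Deep Reinforcement Learning Models | Studies how quantization and pruning affect the performance of deep reinforcement learning models, aiming at efficient deployment in resource-constrained environments. | reinforcement learning, deep reinforcement learning, DRL
2 | Tackling Data Corruption in Offline Reinforcement Learning via Sequence Modeling | Proposes RDT, which uses sequence modeling to address data corruption in offline reinforcement learning. | reinforcement learning, offline RL, offline reinforcement learning
3 | Hindsight Preference Learning for Offline Preference-based Reinforcement Learning | Proposes HPL, which uses hindsight preference learning to address credit assignment in offline preference-based reinforcement learning. | reinforcement learning, preference learning
4 | Graph Reinforcement Learning for Power Grids: A Comprehensive Survey | Surveys control methods for power grids that are based on graph reinforcement learning. | reinforcement learning, representation learning
5 | Improving Knowledge Distillation in Transfer Learning with Layer-wise Learning Rates | Proposes a knowledge distillation method for transfer learning that uses layer-wise learning rates, improving performance on complex tasks. | distillation
6 | Explorative Imitation Learning: A Path Signature Approach for Continuous Environments | Proposes CILO, an imitation learning method based on path signatures and exploration, for continuous control environments. | imitation learning
7 | Understanding the Gains from Repeated Self-Distillation | Studies the gains from repeated self-distillation and shows its potential to reduce risk in linear regression. | distillation
8 | Using Petri Nets as an Integrated Constraint Mechanism for Reinforcement Learning Tasks | Proposes a Petri-net-based constraint framework for reinforcement learning that improves AI trustworthiness, applied to traffic signal control. | reinforcement learning
9 | Simplifying Deep Temporal Difference Learning | Proposes PQN, a simplified deep online Q-learning algorithm that needs no target network or experience replay and still performs strongly. | reinforcement learning, PPO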

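🔬 Pillar 9: Embodied Foundation Models (6 papers)

# | Title | One-sentence summary | Tags
10 | Multimodal Classification via Modal-Aware Interactive Enhancement | Proposes a modal-aware interactive enhancement method to address modality imbalance in multimodal learning. | multimodal
11 | Leveraging Large Language Models for Integrated Satellite-Aerial-Terrestrial Networks: Recent Advances and Future Directions | Explores applications of, and future directions for, large language models in integrated satellite-aerial-terrestrial networks. | large language model
12 | SpikeLLM: Scaling up Spiking Neural Network to Large Language Models via Saliency-based Spiking | Proposes SpikeLLM, which scales spiking neural networks up to large language models via saliency-based spiking for efficient inference. | large language model
13 | On scalable oversight with weak LLMs judging strong LLMs | Studies a scalable oversight framework in which weak LLMs act as judges evaluating strong LLMs. | large language model, multimodal
14 | Lazarus: Resilient and Elastic Training of Mixture-of-Experts Models | Lazarus: elastic, fault-tolerant training for MoE models that improves training efficiency. | large language model
15 | LoCo: Low-Bit Communication Adaptor for Large-scale Model Training | Proposes LoCo, a low-bit communication adaptor that addresses the performance degradation caused by low-precision gradient communication in large-scale model training. | large language model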

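🔬 Pillar 1: Robot Control (2 papers)

# | Title | One-sentence summary | Tags
16 | UpStory: the Uppsala Storytelling dataset | Releases the UpStory dataset for machine learning research on rapport prediction in child interactions. | manipulation, dyadic interaction
17 | Augmented Bayesian Policy Search | Proposes Augmented Bayesian Policy Search (ABS), which combines Bayesian optimization with policy gradient methods to solve high-dimensional motion control problems. | locomotion, reinforcement learning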

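🔬 Pillar 8: Physics-based Animation (1 paper)

# | Title | One-sentence summary | Tags
18 | Spatiotemporal Forecasting of Traffic Flow using Wavelet-based Temporal Attention | Proposes a graph neural network with wavelet-based temporal attention for spatiotemporal traffic flow forecasting. | spatiotemporal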
