cs.LG(2024-05-06)

📊 21 papers in total | 🔗 2 with code

🎯 Interest Area Navigation

Pillar 2: RL Algorithms and Architecture (12) · Pillar 9: Embodied Foundation Models (5 🔗1) · Pillar 8: Physics-based Animation (2 🔗1) · Pillar 6: Video Extraction and Matching (1) · Pillar 1: Robot Control (1)

🔬 Pillar 2: RL Algorithms and Architecture (12 papers)

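| # | Title | One-line summary | Tags | 🔗 |
|---|-------|------------------|------|----|
| 1 | Enhancing Q-Learning with Large Language Model Heuristics | Proposes LLM-guided Q-learning, improving the sample efficiency of reinforcement learning while avoiding bias. | reinforcement learning, reward shaping, large language model | |
| 2 | Reverse Forward Curriculum Learning for Extreme Sample and Demonstration Efficiency in Reinforcement Learning | Proposes Reverse Forward Curriculum Learning (RFCL), improving sample and demonstration efficiency on sparse-reward reinforcement learning tasks. | reinforcement learning, curriculum learning | |
| 3 | A Generalization Theory of Cross-Modality Distillation with Contrastive Learning | Proposes the cross-modality contrastive distillation framework CMCD and theoretically analyzes how modality distance affects generalization performance. | contrastive learning, distillation | |
| 4 | Federated Reinforcement Learning with Constraint Heterogeneity | Proposes FedNPG and FedPPO to address federated reinforcement learning under constraint heterogeneity. | reinforcement learning, PPO, large language model | |
| 5 | Out-of-Distribution Adaptation in Offline RL: Counterfactual Reasoning via Causal Normalizing Flows | Proposes the MOOD-CRL algorithm, which uses causal normalizing flows to address out-of-distribution adaptation in offline reinforcement learning. | reinforcement learning, policy learning, offline RL | |
| 6 | Position: Leverage Foundational Models for Black-Box Optimization | Argues for empowering black-box optimization with sequence models and explores applications of LLMs in experimental design. | reinforcement learning, large language model, foundation model | |
| 7 | Sequence Compression Speeds Up Credit Assignment in Reinforcement Learning | Proposes the Chunked-TD algorithm to speed up credit assignment in reinforcement learning. | reinforcement learning, world model | |
| 8 | End-to-End Reinforcement Learning of Curative Curtailment with Partial Measurement Availability | Proposes a deep reinforcement learning method for coordinated active and reactive power control in distribution grids, addressing voltage limit violations under partial observability. | reinforcement learning, deep reinforcement learning | |
| 9 | Functional Latent Dynamics for Irregularly Sampled Time Series Forecasting | Proposes the Functional Latent Dynamics (FLD) model for efficient forecasting of irregularly sampled time series. | latent dynamics | |
| 10 | ReinWiFi: Application-Layer QoS Optimization of WiFi Networks with Reinforcement Learning | Proposes the reinforcement-learning-based ReinWiFi framework to optimize the QoS of WiFi networks under heterogeneous applications. | reinforcement learning | |
| 11 | Improved Forward-Forward Contrastive Learning | Proposes an improved Forward-Forward contrastive learning algorithm that requires no backpropagation and is more biologically plausible. | contrastive learning | |
| 12 | Policy Learning for Balancing Short-Term and Long-Term Rewards | Proposes a policy learning framework that balances short-term and long-term rewards, addressing the problem of evaluating long-term effects. | policy learning | |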

🔬 Pillar 9: Embodied Foundation Models (5 papers)

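| # | Title | One-line summary | Tags | 🔗 |
|---|-------|------------------|------|----|
| 13 | To Each (Textual Sequence) Its Own: Improving Memorized-Data Unlearning in Large Language Models | Proposes a tailored unlearning strategy for data memorized by LLMs to improve privacy protection. | large language model | |
| 14 | Learning from Students: Applying t-Distributions to Explore Accurate and Efficient Formats for LLMs | For LLM quantization, proposes the Student Float (SF4) format to improve the trade-off between accuracy and efficiency. | large language model | |
| 15 | Outlier Gradient Analysis: Efficiently Identifying Detrimental Training Samples for Deep Learning Models | Proposes an efficient method based on outlier gradient analysis for identifying detrimental training samples in deep learning. | large language model | |
| 16 | Exploring the Frontiers of Softmax: Provable Optimization, Applications in Diffusion Model, and Beyond | Theoretically analyzes the softmax activation function, revealing its optimization properties and its applications in diffusion models. | large language model | |
| 17 | TED: Accelerate Model Training by Internal Generalization | Proposes the TED pruning method, which accelerates model training and compresses datasets through internal generalization. | large language model | |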

🔬 Pillar 8: Physics-based Animation (2 papers)

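| # | Title | One-line summary | Tags | 🔗 |
|---|-------|------------------|------|----|
| 18 | Denoising of Geodetic Time Series Using Spatiotemporal Graph Neural Networks: Application to Slow Slip Event Extraction | Proposes SSEdenoiser, based on spatiotemporal graph neural networks, for denoising geodetic time series and extracting slow slip events. | spatiotemporal | |
| 19 | Spatiotemporal Implicit Neural Representation as a Generalized Traffic Data Learner | Proposes a general traffic data learning framework based on spatiotemporal implicit neural representations, addressing the limited generalization of existing methods. | spatiotemporal | |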

🔬 Pillar 6: Video Extraction and Matching (1 paper)

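| # | Title | One-line summary | Tags | 🔗 |
|---|-------|------------------|------|----|
| 20 | GLIP: Electromagnetic Field Exposure Map Completion by Deep Generative Networks | Proposes GLIP, which uses deep generative networks to complete electromagnetic field exposure maps without explicit training. | sparse sensors | |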

🔬 Pillar 1: Robot Control (1 paper)

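| # | Title | One-line summary | Tags | 🔗 |
|---|-------|------------------|------|----|
| 21 | Decentralized Online Learning in General-Sum Stackelberg Games | Studies decentralized online learning in general-sum Stackelberg games and proposes an information manipulation strategy. | manipulation | |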
