cs.LG (2024-05-06)

📊 21 papers in total | 🔗 2 with code

🎯 Interest Area Navigation

Pillar 2: RL & Architecture (12) · Pillar 9: Embodied Foundation Models (5 🔗1) · Pillar 8: Physics-based Animation (2 🔗1) · Pillar 6: Video Extraction (1) · Pillar 1: Robot Control (1)

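🔬 Pillar 2: RL & Architecture (12 papers)

#	Title	One-Sentence Summary	Tags	🔗	⭐
1	Enhancing Q-Learning with Large Language Model Heuristics	Proposes LLM-guided Q-learning, which improves the sample efficiency of reinforcement learning while avoiding bias.	reinforcement learning, reward shaping, large language model
2	Reverse Forward Curriculum Learning for Extreme Sample and Demonstration Efficiency in Reinforcement Learning	Proposes reverse-forward curriculum learning (RFCL), improving the sample and demonstration efficiency of reinforcement learning on sparse-reward tasks.	reinforcement learning, curriculum learning
3	A Generalization Theory of Cross-Modality Distillation with Contrastive Learning	Proposes the cross-modality contrastive distillation framework CMCD and theoretically analyzes how modality distance affects generalization performance.	contrastive learning, distillation
4	Federated Reinforcement Learning with Constraint Heterogeneity	Proposes FedNPG and FedPPO to address federated reinforcement learning under constraint heterogeneity.	reinforcement learning, PPO, large language model
5	Out-of-Distribution Adaptation in Offline RL: Counterfactual Reasoning via Causal Normalizing Flows	Proposes the MOOD-CRL algorithm, which uses causal normalizing flows to address out-of-distribution adaptation in offline reinforcement learning.	reinforcement learning, policy learning, offline RL
6	Position: Leverage Foundational Models for Black-Box Optimization	Empowers black-box optimization with sequence models and explores the use of LLMs in experimental design.	reinforcement learning, large language model, foundation model
7	Sequence Compression Speeds Up Credit Assignment in Reinforcement Learning	Proposes the Chunked-TD algorithm to speed up credit assignment in reinforcement learning.	reinforcement learning, world model
8	End-to-End Reinforcement Learning of Curative Curtailment with Partial Measurement Availability	Proposes a deep reinforcement learning method for coordinated active and reactive power control in distribution grids, addressing voltage limit violations under partial observability.	reinforcement learning, deep reinforcement learning
9	Functional Latent Dynamics for Irregularly Sampled Time Series Forecasting	Proposes the Functional Latent Dynamics (FLD) model for efficient forecasting of irregularly sampled time series.	latent dynamics
10	ReinWiFi: Application-Layer QoS Optimization of WiFi Networks with Reinforcement Learning	Proposes the reinforcement-learning-based ReinWiFi framework to optimize the QoS of WiFi networks serving heterogeneous applications.	reinforcement learning
11	Improved Forward-Forward Contrastive Learning	Proposes an improved Forward-Forward contrastive learning algorithm that needs no backpropagation and is more biologically plausible.	contrastive learning
12	Policy Learning for Balancing Short-Term and Long-Term Rewards	Proposes a policy learning framework that balances short-term and long-term rewards, addressing the problem of evaluating long-term effects.	policy learning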

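🔬 Pillar 9: Embodied Foundation Models (5 papers)

#	Title	One-Sentence Summary	Tags	🔗	⭐
13	To Each (Textual Sequence) Its Own: Improving Memorized-Data Unlearning in Large Language Models	Targets the unlearning of memorized data in LLMs, proposing individualized unlearning strategies to improve privacy protection.	large language model
14	Learning from Students: Applying t-Distributions to Explore Accurate and Efficient Formats for LLMs	For LLM quantization, proposes the Student Float (SF4) format to optimize the accuracy-efficiency trade-off.	large language model	✅
15	Outlier Gradient Analysis: Efficiently Identifying Detrimental Training Samples for Deep Learning Models	Proposes an efficient method based on outlier gradient analysis for identifying detrimental training samples in deep learning.	large language model
16	Exploring the Frontiers of Softmax: Provable Optimization, Applications in Diffusion Model, and Beyond	Theoretically analyzes the softmax activation function, revealing its optimization properties and its applications in diffusion models.	large language model
17	TED: Accelerate Model Training by Internal Generalization	Proposes the TED pruning method, which accelerates model training and compresses the dataset through internal generalization.	large language model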

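🔬 Pillar 8: Physics-based Animation (2 papers)

#	Title	One-Sentence Summary	Tags	🔗	⭐
18	Denoising of Geodetic Time Series Using Spatiotemporal Graph Neural Networks: Application to Slow Slip Event Extraction	Proposes SSEdenoiser, based on spatiotemporal graph neural networks, for denoising geodetic time series and extracting slow slip events.	spatiotemporal
19	Spatiotemporal Implicit Neural Representation as a Generalized Traffic Data Learner	Proposes a general traffic data learning framework based on spatiotemporal implicit neural representations, addressing the limited generalization of existing methods.	spatiotemporal	✅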

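🔬 Pillar 6: Video Extraction (1 paper)

#	Title	One-Sentence Summary	Tags	🔗	⭐
20	GLIP: Electromagnetic Field Exposure Map Completion by Deep Generative Networks	Proposes GLIP, which uses deep generative networks to complete electromagnetic field exposure maps without explicit training.	sparse sensors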

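🔬 Pillar 1: Robot Control (1 paper)

#	Title	One-Sentence Summary	Tags	🔗	⭐
21	Decentralized Online Learning in General-Sum Stackelberg Games	Studies decentralized online learning in general-sum Stackelberg games and proposes information manipulation strategies.	manipulation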
