cs.LG (2024-05-20)

📊 28 papers in total | 🔗 6 with code

🎯 Interest Area Navigation

Pillar 2: RL & Architecture (15 🔗4) Pillar 9: Embodied Foundation Models (10 🔗2) Pillar 1: Robot Control (2) Pillar 8: Physics-based Animation (1)

🔬 Pillar 2: RL & Architecture (15 papers)

#	Title	One-line summary	Tags	🔗	⭐
1	SSAMBA: Self-Supervised Audio Representation Learning with Mamba State Space Model	Proposes SSAMBA, a Mamba-based self-supervised audio representation learning model.	Mamba SSM state space model
2	Robust Deep Reinforcement Learning with Adaptive Adversarial Perturbations in Action Space	Proposes adaptive adversarial perturbation (A2P) to improve the robustness of DRL in the action space.	reinforcement learning deep reinforcement learning DRL	✅
3	TinyM$^2$Net-V3: Memory-Aware Compressed Multimodal Deep Neural Networks for Sustainable Edge Deployment	TinyM$^2$Net-V3: memory-aware compressed multimodal deep neural networks for sustainable edge deployment.	distillation multimodal
4	Continual Deep Reinforcement Learning for Decentralized Satellite Routing	Proposes a decentralized satellite routing scheme based on continual deep reinforcement learning.	reinforcement learning deep reinforcement learning DRL
5	Feasibility Consistent Representation Learning for Safe Reinforcement Learning	Proposes the Feasibility Consistent Safe Reinforcement Learning (FCSRL) framework to address the difficulty of estimating safety constraints in safe RL.	reinforcement learning policy learning representation learning
6	Investigating the Impact of Choice on Deep Reinforcement Learning for Space Controls	Investigates how the choice of discrete action space affects deep reinforcement learning performance in space control tasks.	reinforcement learning deep reinforcement learning
7	Learning Future Representation with Synthetic Observations for Sample-efficient Reinforcement Learning	Proposes LFS, which improves reinforcement learning sample efficiency by synthesizing future observations.	reinforcement learning policy learning representation learning
8	Scrutinize What We Ignore: Reining In Task Representation Shift Of Context-Based Offline Meta Reinforcement Learning	Proposes a new optimization framework to address task representation shift in context-based offline meta reinforcement learning.	reinforcement learning offline reinforcement learning model-based RL
9	Diffusion for World Modeling: Visual Details Matter in Atari	DIAMOND: a diffusion-based world model for Atari that improves reinforcement learning agent performance.	reinforcement learning world model
10	Federated Learning for Time-Series Healthcare Sensing with Incomplete Modalities	Proposes FLISM to address federated learning for time-series healthcare sensing with incomplete modalities.	representation learning distillation multimodal	✅
11	Reward-Punishment Reinforcement Learning with Maximum Entropy	Proposes the softDMP algorithm, which improves sample efficiency and robustness through maximum-entropy reward-punishment reinforcement learning.	reinforcement learning
12	A Unified Linear Programming Framework for Offline Reward Learning from Human Demonstrations and Feedback	Proposes a unified linear programming framework for offline reward learning and alignment with human feedback.	reinforcement learning inverse reinforcement learning RLHF
13	Efficient Multi-agent Reinforcement Learning by Planning	MAZero: combines planning with reinforcement learning to improve sample efficiency in multi-agent systems.	reinforcement learning	✅
14	Asymptotic theory of in-context learning by linear attention	Derives an exact asymptotic theory of Transformer in-context learning using linear attention.	linear attention
15	Highway Graph to Accelerate Reinforcement Learning	Proposes Highway Graph to accelerate reinforcement learning, improving training efficiency in deterministic discrete environments.	reinforcement learning	✅

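🔬 Pillar 9: Embodied Foundation Models (10 papers)

#	Title	One-line summary	Tags	🔗	⭐
16	Directed Metric Structures arising in Large Language Models	Proposes a new metric structure for analyzing text extension in large language models.	large language model
17	A Foundation Model for the Earth System	Aurora: a foundation model for the Earth system that improves the accuracy and efficiency of a range of environmental predictions.	foundation model
18	Scientific Hypothesis Generation by a Large Language Model: Laboratory Validation in Breast Cancer Treatment	Uses the large language model GPT4 to generate new hypotheses for breast cancer treatment and validates them in laboratory experiments.	large language model
19	Information Leakage from Embedding in Large Language Models	Proposes Embed Parrot, which improves the reconstruction of user inputs from large language model embeddings.	large language model
20	TinyLLaVA Factory: A Modularized Codebase for Small-scale Large Multimodal Models	TinyLLaVA Factory: an extensible, modular codebase for small-scale large multimodal models.	multimodal
21	Erasing the Bias: Fine-Tuning Foundation Models for Semi-Supervised Learning	FineSSL: addresses bias in semi-supervised learning by fine-tuning pre-trained models.	foundation model	✅
22	Towards Foundation Model for Chemical Reactor Modeling: Meta-Learning with Physics-Informed Adaptation	Proposes a foundation model for chemical reactor modeling based on meta-learning and physics-informed adaptation.	foundation model	✅
23	Quantifying In-Context Reasoning Effects and Memorization Effects in LLMs	Proposes an axiomatic system to quantify memorization effects and in-context reasoning effects in LLMs.	large language model
24	Data Contamination Calibration for Black-box LLMs	Proposes Polarized Augment Calibration (PAC) to detect and mitigate data contamination in black-box LLMs.	large language model
25	General bounds on the quality of Bayesian coresets	Proposes general bounds on the quality of Bayesian coresets, improving the efficiency of large-scale Bayesian posterior inference.	multimodal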

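🔬 Pillar 1: Robot Control (2 papers)

#	Title	One-line summary	Tags	🔗	⭐
26	Is Mamba Compatible with Trajectory Optimization in Offline Reinforcement Learning?	Proposes Decision Mamba (DeMa), which achieves better performance and parameter efficiency in trajectory optimization for offline reinforcement learning.	trajectory optimization reinforcement learning offline RL
27	Statistically Truthful Auctions via Acceptance Rule	Proposes STAR, which achieves statistically truthful auction mechanisms through an acceptance rule.	manipulation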

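🔬 Pillar 8: Physics-based Animation (1 paper)

#	Title	One-line summary	Tags	🔗	⭐
28	PLEIADES: Building Temporal Kernels with Orthogonal Polynomials	PLEIADES: builds temporal kernels from orthogonal polynomials for low-latency spatiotemporal classification and detection on event camera data.	spatiotemporal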
