cs.LG (2024-05-20)

📊 28 papers in total | 🔗 6 with code

🎯 Interest Area Navigation

Pillar 2: RL Algorithms and Architecture (15 🔗4) · Pillar 9: Embodied Foundation Models (10 🔗2) · Pillar 1: Robot Control (2) · Pillar 8: Physics-based Animation (1)

🔬 Pillar 2: RL Algorithms and Architecture (15 papers)

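| # | Title | Key point | Tags | 🔗 |
|---|---|---|---|---|
| 1 | SSAMBA: Self-Supervised Audio Representation Learning with Mamba State Space Model | Proposes SSAMBA, a Mamba-based self-supervised audio representation learning model | Mamba, SSM, state space model | |
| 2 | Robust Deep Reinforcement Learning with Adaptive Adversarial Perturbations in Action Space | Proposes adaptive adversarial perturbation (A2P) to improve the robustness of DRL in the action space | reinforcement learning, deep reinforcement learning, DRL | |
| 3 | TinyM$^2$Net-V3: Memory-Aware Compressed Multimodal Deep Neural Networks for Sustainable Edge Deployment | TinyM$^2$Net-V3: memory-aware compressed multimodal deep neural networks for sustainable edge deployment | distillation, multimodal | |
| 4 | Continual Deep Reinforcement Learning for Decentralized Satellite Routing | Proposes a decentralized satellite routing scheme based on continual deep reinforcement learning | reinforcement learning, deep reinforcement learning, DRL | |
| 5 | Feasibility Consistent Representation Learning for Safe Reinforcement Learning | Proposes the feasibility-consistent reinforcement learning (FCSRL) framework to address the difficulty of estimating safety constraints in safe reinforcement learning | reinforcement learning, policy learning, representation learning | |
| 6 | Investigating the Impact of Choice on Deep Reinforcement Learning for Space Controls | Studies how the choice of discrete action space affects the performance of deep reinforcement learning for space control | reinforcement learning, deep reinforcement learning | |
| 7 | Learning Future Representation with Synthetic Observations for Sample-efficient Reinforcement Learning | Proposes LFS, which improves reinforcement learning sample efficiency by synthesizing future observations | reinforcement learning, policy learning, representation learning | |
| 8 | Scrutinize What We Ignore: Reining In Task Representation Shift Of Context-Based Offline Meta Reinforcement Learning | Proposes a new optimization framework for the task representation shift problem in context-based offline meta reinforcement learning | reinforcement learning, offline reinforcement learning, model-based RL | |
| 9 | Diffusion for World Modeling: Visual Details Matter in Atari | DIAMOND: a diffusion-based world model for Atari that improves reinforcement learning agent performance | reinforcement learning, world model | |
| 10 | Federated Learning for Time-Series Healthcare Sensing with Incomplete Modalities | Proposes FLISM to address time-series healthcare sensing with incomplete modalities in federated learning | representation learning, distillation, multimodal | |
| 11 | Reward-Punishment Reinforcement Learning with Maximum Entropy | Proposes the softDMP algorithm, which improves sample efficiency and robustness through maximum-entropy reward-punishment reinforcement learning | reinforcement learning | |
| 12 | A Unified Linear Programming Framework for Offline Reward Learning from Human Demonstrations and Feedback | Proposes a unified linear-programming framework for offline reward learning and alignment with human feedback | reinforcement learning, inverse reinforcement learning, RLHF | |
| 13 | Efficient Multi-agent Reinforcement Learning by Planning | MAZero: combines planning with reinforcement learning to improve sample efficiency in multi-agent systems | reinforcement learning | |
| 14 | Asymptotic theory of in-context learning by linear attention | Derives an exact asymptotic theory of Transformer in-context learning using linear attention | linear attention | |
| 15 | Highway Graph to Accelerate Reinforcement Learning | Proposes Highway Graph to accelerate reinforcement learning, improving training efficiency in deterministic discrete environments | reinforcement learning | |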

🔬 Pillar 9: Embodied Foundation Models (10 papers)

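| # | Title | Key point | Tags | 🔗 |
|---|---|---|---|---|
| 16 | Directed Metric Structures arising in Large Language Models | Proposes a new metric structure for analyzing text extension in large language models | large language model | |
| 17 | A Foundation Model for the Earth System | Aurora: a foundation model for the Earth system that markedly improves the accuracy and efficiency of a range of environmental forecasts | foundation model | |
| 18 | Scientific Hypothesis Generation by a Large Language Model: Laboratory Validation in Breast Cancer Treatment | Uses the large language model GPT4 to generate new hypotheses for breast cancer treatment, validated in laboratory experiments | large language model | |
| 19 | Information Leakage from Embedding in Large Language Models | Proposes Embed Parrot, improving the ability to reconstruct user inputs from large language model embeddings | large language model | |
| 20 | TinyLLaVA Factory: A Modularized Codebase for Small-scale Large Multimodal Models | TinyLLaVA Factory: an extensible, modular codebase for small-scale large multimodal models | multimodal | |
| 21 | Erasing the Bias: Fine-Tuning Foundation Models for Semi-Supervised Learning | FineSSL: addresses bias in semi-supervised learning by fine-tuning pretrained models | foundation model | |
| 22 | Towards Foundation Model for Chemical Reactor Modeling: Meta-Learning with Physics-Informed Adaptation | Proposes a foundation model for chemical reactor modeling based on meta-learning and physics-informed adaptation | foundation model | |
| 23 | Quantifying In-Context Reasoning Effects and Memorization Effects in LLMs | Proposes an axiomatic system to quantify memorization effects and in-context reasoning effects in LLMs | large language model | |
| 24 | Data Contamination Calibration for Black-box LLMs | Proposes Polarized Augment Calibration (PAC) to detect and mitigate data contamination in black-box LLMs | large language model | |
| 25 | General bounds on the quality of Bayesian coresets | Proposes general bounds on the quality of Bayesian coresets, improving the efficiency of large-scale Bayesian posterior inference | multimodal | |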

🔬 Pillar 1: Robot Control (2 papers)

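| # | Title | Key point | Tags | 🔗 |
|---|---|---|---|---|
| 26 | Is Mamba Compatible with Trajectory Optimization in Offline Reinforcement Learning? | Proposes Decision Mamba (DeMa), achieving better performance and parameter efficiency in trajectory optimization for offline reinforcement learning | trajectory optimization, reinforcement learning, offline RL | |
| 27 | Statistically Truthful Auctions via Acceptance Rule | Proposes STAR, which achieves statistically truthful auction mechanisms via an acceptance rule | manipulation | |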

🔬 Pillar 8: Physics-based Animation (1 paper)

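| # | Title | Key point | Tags | 🔗 |
|---|---|---|---|---|
| 28 | PLEIADES: Building Temporal Kernels with Orthogonal Polynomials | PLEIADES: builds temporal kernels from orthogonal polynomials for low-latency spatiotemporal classification and detection on event-camera data | spatiotemporal | |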
