| # | Title | Summary | Tags | ✅ |
| --- | --- | --- | --- | --- |
| 1 | Low-Rank Similarity Mining for Multimodal Dataset Distillation | Proposes LoRS for multimodal dataset distillation, tackling similarity learning over image-text pairs. | contrastive learning, distillation, multimodal | ✅ |
| 2 | Strategically Conservative Q-Learning | Proposes Strategically Conservative Q-learning (SCQ) to address overly conservative value estimates in offline reinforcement learning. | reinforcement learning, offline RL, offline reinforcement learning | ✅ |
| 3 | Aligning Agents like Large Language Models | Borrows the LLM training paradigm to improve the generality and robustness of agents in 3D environments. | reinforcement learning, large language model | ✅ |
| 4 | Self-Play with Adversarial Critic: Provable and Scalable Offline Alignment for Language Models | Proposes SPAC, a provable and scalable offline alignment method for language models. | reinforcement learning, offline RL, offline reinforcement learning | |
| 5 | Deterministic Uncertainty Propagation for Improved Model-Based Offline Reinforcement Learning | Proposes MOMBO, a moment-matching-based model-based offline RL algorithm that makes deterministic uncertainty propagation more efficient. | reinforcement learning, offline reinforcement learning | |
| 6 | Chimera: Effectively Modeling Multivariate Time Series with 2-Dimensional State Space Models | Chimera: effectively models multivariate time series with 2-dimensional state space models. | Mamba, SSM, state space model | |
| 7 | TSCMamba: Mamba Meets Multi-View Learning for Time Series Classification | TSCMamba: a multivariate time-series classification method combining multi-view learning with Mamba. | Mamba, state space model | |
| 8 | Road Network Representation Learning with the Third Law of Geography | Proposes a road network representation learning framework based on the Third Law of Geography, improving road-segment representations on downstream tasks. | representation learning, contrastive learning | |
| 9 | Excluding the Irrelevant: Focusing Reinforcement Learning through Continuous Action Masking | Proposes continuous action masking, improving RL efficiency by focusing on the relevant action space. | reinforcement learning, PPO | |
| 10 | Open Problem: Active Representation Learning | Proposes an active representation learning framework, addressing joint exploration and representation learning in partially observable environments. | representation learning | |
| 11 | Mitigating Bias in Dataset Distillation | Proposes a kernel-density-estimation-based reweighting method to mitigate bias amplification in dataset distillation. | distillation | |
| 12 | ATraDiff: Accelerating Online Reinforcement Learning with Imaginary Trajectories | ATraDiff: accelerates online reinforcement learning with generated trajectories, addressing sparse rewards. | reinforcement learning | |
| 13 | Spread Preference Annotation: Direct Preference Judgment for Efficient LLM Alignment | Proposes Spread Preference Annotation, aligning LLMs efficiently with only a small amount of data. | preference learning, large language model | |
| 14 | What is Dataset Distillation Learning? | Studies what dataset distillation learns, revealing the properties of distilled data and how information is stored in it. | distillation | |
| 15 | Multi-Agent Imitation Learning: Value is Easy, Regret is Hard | For multi-agent imitation learning, proposes the MALICE and BLADES algorithms based on minimizing the regret gap. | imitation learning | |
| 16 | STEMO: Early Spatio-temporal Forecasting with Multi-Objective Reinforcement Learning | Proposes STEMO, a multi-objective reinforcement learning model for early spatio-temporal forecasting that balances accuracy and timeliness. | reinforcement learning | |
| 17 | Mini Honor of Kings: A Lightweight Environment for Multi-Agent Reinforcement Learning | Proposes Mini HoK, a lightweight environment to advance multi-agent reinforcement learning research and algorithm innovation. | reinforcement learning | ✅ |
| 18 | Breeding Programs Optimization with Reinforcement Learning | Proposes an RL-based method for optimizing breeding programs, improving crop genetic gain. | reinforcement learning | |
| 19 | Towards Dynamic Trend Filtering through Trend Point Detection with Reinforcement Learning | Proposes an RL-based dynamic trend filtering method for capturing abrupt trend changes in time series. | reinforcement learning | |
| 20 | Transductive Off-policy Proximal Policy Optimization | Proposes Transductive Off-policy PPO (ToPPO), improving PPO's utilization of off-policy data. | reinforcement learning, PPO | |
| 21 | Improving Actor-Critic Training with Steerable Action-Value Approximation Errors | Proposes Utility Soft Actor-Critic (USAC), improving actor-critic training via steerable action-value approximation errors. | reinforcement learning, deep reinforcement learning | |
| 22 | How does Inverse RL Scale to Large State Spaces? A Provably Efficient Approach | Proposes the CATY-IRL algorithm, addressing inverse reinforcement learning over large state spaces in linear MDPs. | reinforcement learning, inverse reinforcement learning | |
| 23 | Reflective Policy Optimization | Proposes Reflective Policy Optimization (RPO), improving the sample efficiency of on-policy reinforcement learning. | reinforcement learning, PPO | ✅ |