cs.LG (2024-06-06)

📊 41 papers | 🔗 7 with code

🎯 Interest Area Navigation

Pillar 2: RL Algorithms & Architecture (23 🔗5) · Pillar 9: Embodied Foundation Models (15 🔗2) · Pillar 1: Robot Control (2) · Pillar 8: Physics-based Animation (1)

🔬 Pillar 2: RL Algorithms & Architecture (23 papers)

| # | Title | One-Sentence Takeaway | Tags | 🔗 |
|---|---|---|---|---|
| 1 | Low-Rank Similarity Mining for Multimodal Dataset Distillation | Proposes LoRS for multimodal dataset distillation, tackling similarity learning over image-text pairs. | contrastive learning, distillation, multimodal | |
| 2 | Strategically Conservative Q-Learning | Proposes Strategically Conservative Q-Learning (SCQ) to address overly conservative value estimates in offline RL. | reinforcement learning, offline RL | |
| 3 | Aligning Agents like Large Language Models | Borrows the LLM training paradigm to improve the generality and robustness of agents in 3D environments. | reinforcement learning, large language model | |
| 4 | Self-Play with Adversarial Critic: Provable and Scalable Offline Alignment for Language Models | Proposes SPAC, a provable and scalable offline alignment method for language models. | reinforcement learning, offline RL | |
| 5 | Deterministic Uncertainty Propagation for Improved Model-Based Offline Reinforcement Learning | Proposes MOMBO, a moment-matching-based model-based offline RL algorithm that makes deterministic uncertainty propagation more efficient. | reinforcement learning, offline RL | |
| 6 | Chimera: Effectively Modeling Multivariate Time Series with 2-Dimensional State Space Models | Chimera: effectively models multivariate time series with two-dimensional state space models. | Mamba, SSM, state space model | |
| 7 | TSCMamba: Mamba Meets Multi-View Learning for Time Series Classification | TSCMamba: combines multi-view learning with Mamba for multivariate time series classification. | Mamba, state space model | |
| 8 | Road Network Representation Learning with the Third Law of Geography | Proposes a road network representation learning framework based on the Third Law of Geography, improving road-segment representations on downstream tasks. | representation learning, contrastive learning | |
| 9 | Excluding the Irrelevant: Focusing Reinforcement Learning through Continuous Action Masking | Proposes continuous action masking, improving RL efficiency by focusing on the relevant part of the action space. | reinforcement learning, PPO | |
| 10 | Open Problem: Active Representation Learning | Proposes an active representation learning framework for joint exploration and representation learning in partially observable environments. | representation learning | |
| 11 | Mitigating Bias in Dataset Distillation | Proposes a kernel-density-estimation-based reweighting method to mitigate bias amplification in dataset distillation. | distillation | |
| 12 | ATraDiff: Accelerating Online Reinforcement Learning with Imaginary Trajectories | ATraDiff: accelerates online RL with generated trajectories, addressing sparse rewards. | reinforcement learning | |
| 13 | Spread Preference Annotation: Direct Preference Judgment for Efficient LLM Alignment | Proposes Spread Preference Annotation, aligning LLMs efficiently with only a small amount of data. | preference learning, large language model | |
| 14 | What is Dataset Distillation Learning? | Studies what dataset distillation learns, revealing the properties of distilled data and how it stores information. | distillation | |
| 15 | Multi-Agent Imitation Learning: Value is Easy, Regret is Hard | For multi-agent imitation learning, proposes the MALICE and BLADES algorithms based on minimizing the regret gap. | imitation learning | |
| 16 | STEMO: Early Spatio-temporal Forecasting with Multi-Objective Reinforcement Learning | Proposes STEMO, a multi-objective RL model for early spatio-temporal forecasting that balances accuracy and timeliness. | reinforcement learning | |
| 17 | Mini Honor of Kings: A Lightweight Environment for Multi-Agent Reinforcement Learning | Proposes Mini HoK, a lightweight environment to advance multi-agent RL research and algorithm innovation. | reinforcement learning | |
| 18 | Breeding Programs Optimization with Reinforcement Learning | Proposes an RL-based method for optimizing breeding programs, improving genetic gain in crops. | reinforcement learning | |
| 19 | Towards Dynamic Trend Filtering through Trend Point Detection with Reinforcement Learning | Proposes an RL-based dynamic trend filtering method for capturing abrupt trend changes in time series. | reinforcement learning | |
| 20 | Transductive Off-policy Proximal Policy Optimization | Proposes Transductive Off-policy PPO (ToPPO), improving PPO's reuse of off-policy data. | reinforcement learning, PPO | |
| 21 | Improving Actor-Critic Training with Steerable Action-Value Approximation Errors | Proposes Utility Soft Actor-Critic (USAC), improving actor-critic training through steerable action-value approximation errors. | reinforcement learning, deep reinforcement learning | |
| 22 | How does Inverse RL Scale to Large State Spaces? A Provably Efficient Approach | Proposes CATY-IRL, a provably efficient algorithm for inverse RL over large state spaces in linear MDPs. | reinforcement learning, inverse reinforcement learning | |
| 23 | Reflective Policy Optimization | Proposes Reflective Policy Optimization (RPO), improving the sample efficiency of on-policy RL. | reinforcement learning, PPO | |
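Entry 9 above studies action masking in continuous action spaces; the underlying idea is easiest to see in the standard discrete case. The sketch below is a minimal NumPy illustration of masking irrelevant actions out of a softmax policy — the function name, logits, and mask are illustrative, not taken from the paper:

```python
import numpy as np

def masked_policy(logits: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Softmax over logits with invalid actions (mask == 0) excluded.

    Invalid actions receive probability exactly 0; the remaining
    probabilities renormalize to sum to 1.
    """
    masked_logits = np.where(mask.astype(bool), logits, -np.inf)
    masked_logits = masked_logits - masked_logits.max()  # numerical stability
    exp = np.exp(masked_logits)                          # exp(-inf) == 0
    return exp / exp.sum()

logits = np.array([2.0, 1.0, 0.5, -1.0])
mask = np.array([1, 0, 1, 0])  # actions 1 and 3 are marked irrelevant
probs = masked_policy(logits, mask)
```

Masking at the distribution level, rather than penalizing invalid actions through the reward, means the agent never spends samples exploring actions known to be irrelevant.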

🔬 Pillar 9: Embodied Foundation Models (15 papers)

| # | Title | One-Sentence Takeaway | Tags | 🔗 |
|---|---|---|---|---|
| 24 | MuJo: Multimodal Joint Feature Space Learning for Human Activity Recognition | Proposes MuJo, a multimodal joint feature-space learning method that improves human activity recognition across multiple modalities. | foundation model, multimodal | |
| 25 | From Tissue Plane to Organ World: A Benchmark Dataset for Multimodal Biomedical Image Registration using Deep Co-Attention Networks | Proposes the ATOM benchmark dataset and uses deep co-attention networks for multimodal biomedical image registration. | multimodal | |
| 26 | CIRCUITSYNTH: Leveraging Large Language Models for Circuit Topology Synthesis | Proposes CIRCUITSYNTH, which leverages large language models to synthesize circuit topologies automatically. | large language model | |
| 27 | HORAE: A Domain-Agnostic Language for Automated Service Regulation | Proposes the domain-agnostic language HORAE together with RuleGPT for automated modeling and reasoning over service regulations. | large language model, multimodal | |
| 28 | Verbalized Machine Learning: Revisiting Machine Learning with Language Models | Proposes Verbalized Machine Learning, using language models to solve classic machine learning problems with improved interpretability. | large language model | |
| 29 | Improving Alignment and Robustness with Circuit Breakers | Proposes a "circuit breaker" AI safety mechanism that strengthens defenses against harmful behavior and adversarial attacks. | multimodal | |
| 30 | Clipping Improves Adam-Norm and AdaGrad-Norm when the Noise Is Heavy-Tailed | Shows that gradient clipping improves Adam-Norm and AdaGrad-Norm under heavy-tailed noise. | large language model | |
| 31 | Generative AI-in-the-loop: Integrating LLMs and GPTs into the Next Generation Networks | Proposes a "generative AI-in-the-loop" framework combining LLMs with traditional ML to improve next-generation networks. | large language model | |
| 32 | Open-Endedness is Essential for Artificial Superhuman Intelligence | Argues that open-endedness is essential for the self-improvement needed to reach artificial superhuman intelligence. | foundation model | |
| 33 | On Limitation of Transformer for Learning HMMs | Shows Transformers have limitations in learning hidden Markov models and proposes Block CoT training to mitigate them. | chain-of-thought | |
| 34 | Weight-based Decomposition: A Case for Bilinear MLPs | Weight-based decomposition for bilinear MLPs, improving model interpretability. | foundation model | |
| 35 | Empirical Guidelines for Deploying LLMs onto Resource-constrained Edge Devices | Provides empirical guidelines for deploying LLMs on resource-constrained edge devices, optimizing model customization and deployment. | large language model | |
| 36 | Enhancing In-Context Learning Performance with just SVD-Based Weight Pruning: A Theoretical Perspective | Improves LLM in-context learning with SVD-based weight pruning, from a theoretical perspective. | large language model | |
| 37 | FastGAS: Fast Graph-based Annotation Selection for In-Context Learning | FastGAS: a fast graph-based annotation selection method for in-context learning. | large language model | |
| 38 | What Should Embeddings Embed? Autoregressive Models Represent Latent Generating Distributions | Shows that autoregressive models represent latent generating distributions, probing what embeddings should encode. | large language model | |
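Entry 36 above studies SVD-based weight pruning for in-context learning. Its basic primitive, truncating a weight matrix to rank k via SVD, is standard linear algebra and can be sketched as follows (the function name and toy matrix are illustrative, not from the paper):

```python
import numpy as np

def svd_truncate(W: np.ndarray, k: int) -> np.ndarray:
    """Best rank-k approximation of W in Frobenius norm (Eckart-Young)."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    # Keep only the k largest singular values/vectors.
    return (U[:, :k] * s[:k]) @ Vt[:k, :]

rng = np.random.default_rng(0)
# A matrix that is exactly rank 2, plus small noise.
W = rng.normal(size=(8, 2)) @ rng.normal(size=(2, 8)) + 0.01 * rng.normal(size=(8, 8))
W2 = svd_truncate(W, 2)
err = np.linalg.norm(W - W2)  # small: only the noise component is discarded
```

The intuition behind such pruning is that when a weight matrix is close to low rank, dropping its small singular directions removes mostly noise while preserving the transformation the layer actually computes.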

🔬 Pillar 1: Robot Control (2 papers)

| # | Title | One-Sentence Takeaway | Tags | 🔗 |
|---|---|---|---|---|
| 39 | Robust Deep Reinforcement Learning against Adversarial Behavior Manipulation | Proposes an imitation-learning-based adversarial attack on RL agents and a temporal-discount regularization defense. | manipulation, reinforcement learning, deep reinforcement learning | |
| 40 | Bootstrapping Expectiles in Reinforcement Learning | Proposes ExpectRL to address overestimation in reinforcement learning. | domain randomization, reinforcement learning, TD3 | |

🔬 Pillar 8: Physics-based Animation (1 paper)

| # | Title | One-Sentence Takeaway | Tags | 🔗 |
|---|---|---|---|---|
| 41 | FLUID-LLM: Learning Computational Fluid Dynamics with Spatiotemporal-aware Large Language Models | FLUID-LLM: proposes spatiotemporal-aware large language models for learning computational fluid dynamics. | spatiotemporal, large language model | |
