cs.LG (2024-10-31)

📊 40 papers in total | 🔗 8 with code

🎯 Interest Area Navigation

Pillar 2: RL & Architecture (20, 🔗 5) · Pillar 9: Embodied Foundation Models (15, 🔗 3) · Pillar 1: Robot Control (2) · Pillar 7: Motion Retargeting (1) · Pillar 4: Generative Motion (1) · Pillar 8: Physics-based Animation (1)

🔬 Pillar 2: RL & Architecture (20 papers)

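| # | Title | Key point | Tags |
|---|---|---|---|
| 1 | Reinforcement Learning Gradients as Vitamin for Online Finetuning Decision Transformers | Uses reinforcement learning gradients to improve online finetuning of Decision Transformers, raising the performance of models pretrained on low-reward data. | reinforcement learning, TD3, offline reinforcement learning |
| 2 | SambaMixer: State of Health Prediction of Li-ion Batteries using Mamba State Space Models | Proposes SambaMixer for predicting the state of health of Li-ion batteries. | Mamba, SSM, state space model |
| 3 | An Information Criterion for Controlled Disentanglement of Multimodal Data | Proposes DisentangledSSL for controlled disentangled representation learning on multimodal data. | representation learning, multimodal |
| 4 | Rethinking Inverse Reinforcement Learning: from Data Alignment to Task Alignment | Proposes a task-alignment-based inverse reinforcement learning framework that improves performance in complex environments and in transfer learning. | reinforcement learning, imitation learning, inverse reinforcement learning |
| 5 | RA-PbRL: Provably Efficient Risk-Aware Preference-Based Reinforcement Learning | Proposes RA-PbRL, a risk-aware preference-based reinforcement learning algorithm, to address AI safety concerns. | reinforcement learning, RLHF, large language model |
| 6 | EARL-BO: Reinforcement Learning for Multi-Step Lookahead, High-Dimensional Bayesian Optimization | Proposes EARL-BO, which uses reinforcement learning to tackle multi-step lookahead in high-dimensional Bayesian optimization. | reinforcement learning, policy learning |
| 7 | Compositional Automata Embeddings for Goal-Conditioned Reinforcement Learning | Proposes a goal-conditioned reinforcement learning method based on compositional automata embeddings. | reinforcement learning |
| 8 | Progressive Safeguards for Safe and Model-Agnostic Reinforcement Learning | Proposes a model-agnostic meta-learning framework for safe reinforcement learning that improves safety through progressive safeguards. | reinforcement learning |
| 9 | Local Linearity: the Key for No-regret Reinforcement Learning in Continuous MDPs | Proposes a local linearization approach to no-regret reinforcement learning in continuous MDPs. | reinforcement learning |
| 10 | Noise as a Double-Edged Sword: Reinforcement Learning Exploits Randomized Defenses in Neural Networks | Studies how noise affects reinforcement-learning-based attacks and proposes finer-grained defense strategies. | reinforcement learning |
| 11 | Enhancing Chess Reinforcement Learning with Graph Representation | Proposes a graph-representation-based reinforcement learning method that improves the generalization and training efficiency of chess AI. | reinforcement learning |
| 12 | A Non-Monolithic Policy Approach of Offline-to-Online Reinforcement Learning | Proposes a non-monolithic policy approach to offline-to-online reinforcement learning that improves online policy learning efficiency. | reinforcement learning |
| 13 | VecCity: A Taxonomy-guided Library for Map Entity Representation Learning | VecCity: a taxonomy-guided library for map entity representation learning, aimed at unified evaluation and model reuse. | representation learning |
| 14 | Adaptive Alignment: Dynamic Preference Adjustments via Multi-Objective Reinforcement Learning for Pluralistic AI | Proposes an adaptive alignment framework based on multi-objective reinforcement learning that dynamically adjusts AI to diverse user preferences. | reinforcement learning |
| 15 | How Do Flow Matching Models Memorize and Generalize in Sample Data Subspaces? | Analyzes how flow matching models memorize and generalize in sample data subspaces. | flow matching |
| 16 | Disentangling Disentangled Representations: Towards Improved Latent Units via Diffusion Models | Proposes a diffusion-model-based disentangled representation learning method that improves the interpretability and independence of latent units. | DRL, representation learning |
| 17 | Dynamical similarity analysis can identify compositional dynamics developing in RNNs | Shows that dynamical similarity analysis (DSA) can identify compositional dynamics as they develop in RNNs, outperforming existing methods. | Mamba, state space model |
| 18 | Maximum Entropy Hindsight Experience Replay | Proposes Maximum Entropy Hindsight Experience Replay (MaxEnt-HER) to improve PPO performance in goal-conditioned reinforcement learning. | reinforcement learning, PPO |
| 19 | Deterministic Exploration via Stationary Bellman Error Maximization | Proposes a deterministic exploration method based on stationary Bellman error maximization that improves exploration efficiency in reinforcement learning. | reinforcement learning, policy learning |
| 20 | CALE: Continuous Arcade Learning Environment | Proposes CALE, an extension of the Arcade Learning Environment (ALE) that supports continuous action control. | PPO, SAC |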

🔬 Pillar 9: Embodied Foundation Models (15 papers)

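| # | Title | Key point | Tags |
|---|---|---|---|
| 21 | OCEAN: Offline Chain-of-thought Evaluation and Alignment in Large Language Models | Proposes the OCEAN framework for offline evaluation and optimization of chain-of-thought capabilities in large language models. | large language model, chain-of-thought |
| 22 | LLaMo: Large Language Model-based Molecular Graph Assistant | LLaMo: a large language model-based molecular graph assistant for molecular understanding and generation. | large language model, instruction following |
| 23 | LLM-Inference-Bench: Inference Benchmarking of Large Language Models on AI Accelerators | LLM-Inference-Bench: a benchmarking suite for large language model inference on AI accelerators. | large language model |
| 24 | Matchmaker: Self-Improving Large Language Model Programs for Schema Matching | Matchmaker: a schema matching method based on self-improving large language model programs. | large language model |
| 25 | In-Context Fine-Tuning for Time-Series Foundation Models | Proposes in-context fine-tuning for time-series foundation models, improving zero-shot forecasting performance. | foundation model |
| 26 | Context-Aware Testing: A New Paradigm for Model Testing with Large Language Models | Proposes context-aware testing (CAT), which uses large language models to guide model testing and surface potential failure scenarios. | large language model |
| 27 | Metamorphic Malware Evolution: The Potential and Peril of Large Language Models | Uses large language models to study metamorphic malware evolution and to build a detection framework. | large language model |
| 28 | End-to-End Ontology Learning with Large Language Models | OLLM: an end-to-end ontology learning method based on large language models that improves semantic accuracy and structural integrity. | large language model |
| 29 | Beyond Accuracy: Ensuring Correct Predictions With Correct Rationales | Proposes a doubly-correct prediction framework that ensures visual recognition models produce both correct predictions and correct rationales. | foundation model |
| 30 | MESS+: Energy-Optimal Inferencing in Language Model Zoos with Service Level Guarantees | Proposes MESS+ to optimize energy efficiency when selecting among language models. | large language model |
| 31 | AlphaTrans: A Neuro-Symbolic Compositional Approach for Repository-Level Code Translation and Validation | AlphaTrans: a neuro-symbolic compositional approach for repository-level code translation and validation. | large language model |
| 32 | EigenVI: score-based variational inference with orthogonal function expansions | Proposes EigenVI to address efficiency in black-box variational inference. | multimodal |
| 33 | Failure Modes of LLMs for Causal Reasoning on Narratives | Reveals failure modes of LLMs in causal reasoning on narratives and proposes improvements. | large language model |
| 34 | RAGraph: A General Retrieval-Augmented Graph Learning Framework | Proposes RAGraph to address the generalization of graph neural networks to unseen graph data. | foundation model |
| 35 | Automatically Learning Hybrid Digital Twins of Dynamical Systems | Proposes automatically learning hybrid digital twins to address dynamical system modeling. | large language model |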

🔬 Pillar 1: Robot Control (2 papers)

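| # | Title | Key point | Tags |
|---|---|---|---|
| 36 | $π_0$: A Vision-Language-Action Flow Model for General Robot Control | Proposes $π_0$, a general robot control framework based on a vision-language-action flow model, improving robot generalization. | dual-arm, flow matching, vision-language-action |
| 37 | A Universal Quantum Computer From Relativistic Motion | Proposes a universal quantum computer architecture based on relativistic motion, using variational quantum circuits to realize quantum computation. | manipulation |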

🔬 Pillar 7: Motion Retargeting (1 paper)

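| # | Title | Key point | Tags |
|---|---|---|---|
| 38 | Graph Neural Networks Uncover Geometric Neural Representations in Reinforcement-Based Motor Learning | Uses graph neural networks to uncover the geometric properties of neural representations in reinforcement-based motor learning. | spatial relationship |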

🔬 Pillar 4: Generative Motion (1 paper)

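| # | Title | Key point | Tags |
|---|---|---|---|
| 39 | DiffBatt: A Diffusion Model for Battery Degradation Prediction and Synthesis | DiffBatt: a diffusion-model-based method for battery degradation prediction and synthesis. | classifier-free guidance |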

🔬 Pillar 8: Physics-based Animation (1 paper)

| # | Title | Key point | Tags |
|---|---|---|---|
| 40 | APEBench: A Benchmark for Autoregressive Neural Emulators of PDEs | APEBench: a benchmark platform for autoregressive neural emulators of PDEs. | differentiable simulation |
