cs.LG(2024-05-30)

📊 29 papers in total | 🔗 6 with code

🎯 Interest Area Navigation

Pillar 2: RL & Architecture (17 🔗4) · Pillar 9: Embodied Foundation Models (9 🔗1) · Pillar 1: Robot Control (2 🔗1) · Pillar 8: Physics-based Animation (1)

🔬 Pillar 2: RL & Architecture (17 papers)

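# | Title | One-line summary | Tags | 🔗
1 | Diffusion Policies creating a Trust Region for Offline Reinforcement Learning | Proposes DTQL: accelerates offline reinforcement learning via a diffusion trust region, balancing performance and efficiency | reinforcement learning, offline RL, offline reinforcement learning
2 | Aquatic Navigation: A Challenging Benchmark for Deep Reinforcement Learning | Proposes an aquatic navigation benchmark environment for evaluating and improving deep reinforcement learning algorithms | reinforcement learning, deep reinforcement learning, DRL
3 | Learning from Random Demonstrations: Offline Reinforcement Learning with Importance-Sampled Diffusion Models | Proposes an offline reinforcement learning method based on importance-sampled diffusion models, improving policy learning from random data | reinforcement learning, offline reinforcement learning, world model
4 | Adaptive Advantage-Guided Policy Regularization for Offline Reinforcement Learning | Proposes Adaptive Advantage-Guided Policy Regularization (A2PR) to address over-conservatism in offline reinforcement learning | reinforcement learning, offline reinforcement learning
5 | Sequence-Augmented SE(3)-Flow Matching For Conditional Protein Backbone Generation | FoldFlow-2: sequence-augmented SE(3) flow matching for conditional protein backbone generation | flow matching, large language model
6 | From Words to Actions: Unveiling the Theoretical Underpinnings of LLM-Driven Autonomous Systems | Analyzes LLM-driven autonomous systems from a theoretical perspective and proposes improvement strategies | reinforcement learning, imitation learning, world model
7 | Algorithmic Fairness in Performative Policy Learning: Escaping the Impossibility of Group Fairness | Exploits performativity in policy learning to resolve the group fairness dilemma | policy learning, predictive model
8 | Preference Alignment with Flow Matching | Proposes Preference Flow Matching for efficient preference alignment of pretrained models | reinforcement learning, flow matching
9 | Transformers and Slot Encoding for Sample Efficient Physical World Modelling | Proposes a world modelling method based on Transformers and slot encoding that improves sample efficiency | world model
10 | SleeperNets: Universal Backdoor Poisoning Attacks Against Reinforcement Learning Agents | Proposes SleeperNets, a universal backdoor poisoning attack against reinforcement learning agents | reinforcement learning
11 | FCOM: A Federated Collaborative Online Monitoring Framework via Representation Learning | Proposes a federated collaborative online monitoring framework based on representation learning, addressing resource allocation under heterogeneous data | representation learning
12 | Hybrid Reinforcement Learning Framework for Mixed-Variable Problems | Proposes a hybrid reinforcement learning framework for mixed-variable optimization problems | reinforcement learning
13 | Length independent generalization bounds for deep SSM architectures via Rademacher contraction and stability constraints | Proposes length-independent PAC bounds to optimize deep state space model architectures | SSM
14 | Randomized Exploration for Reinforcement Learning with Multinomial Logistic Function Approximation | Proposes randomized exploration algorithms for reinforcement learning with multinomial logistic function approximation | reinforcement learning
15 | Joint Selective State Space Model and Detrending for Robust Time Series Anomaly Detection | Proposes a multi-stage time series anomaly detection method combining a selective state space model with detrending | state space model
16 | MetaCURL: Non-stationary Concave Utility Reinforcement Learning | Proposes the MetaCURL algorithm for concave utility reinforcement learning in non-stationary MDPs | reinforcement learning
17 | Q-learning as a monotone scheme | Interprets Q-learning as a monotone scheme and analyzes how function approximation affects stability | reinforcement learning, deep reinforcement learning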

🔬 Pillar 9: Embodied Foundation Models (9 papers)

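# | Title | One-line summary | Tags | 🔗
18 | Large Language Models Can Self-Improve At Web Agent Tasks | LLMs improve their web agent task performance through self-improvement, gaining 31% on WebArena | large language model
19 | Multimodal Lego: Model Merging and Fine-Tuning Across Topologies and Modalities in Biomedicine | Proposes MM-Lego, a general multimodal fusion framework for biomedicine that achieves high performance with little or no fine-tuning | multimodal
20 | SysCaps: Language Interfaces for Simulation Surrogates of Complex Systems | SysCaps: uses natural language interfaces to improve the usability and generalization of simulation surrogates for complex systems | large language model, multimodal
21 | ETHER: Efficient Finetuning of Large-Scale Models with Hyperplane Reflections | Proposes ETHER, an efficient fine-tuning method for large models based on hyperplane reflections that significantly reduces the parameter count | foundation model
22 | Knockout: A simple way to handle missing inputs | Proposes Knockout, a method for handling missing inputs in deep learning models | multimodal
23 | Kernel Language Entropy: Fine-grained Uncertainty Quantification for LLMs from Semantic Similarities | Proposes Kernel Language Entropy (KLE) for fine-grained quantification of semantic uncertainty in large language models | large language model
24 | Similarity is Not All You Need: Endowing Retrieval Augmented Generation with Multi Layered Thoughts | Proposes the MetRag framework to address the reliance on similarity in retrieval-augmented generation | large language model
25 | Parrot: Efficient Serving of LLM-based Applications with Semantic Variable | Parrot: uses semantic variables to serve LLM-based applications efficiently | large language model
26 | Why Larger Language Models Do In-context Learning Differently? | Theoretical analysis of why in-context learning differs across language model sizes: model scale affects sensitivity to noise | large language model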

🔬 Pillar 1: Robot Control (2 papers)

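# | Title | One-line summary | Tags | 🔗
27 | Fourier Controller Networks for Real-Time Decision-Making in Embodied Learning | Proposes Fourier Controller Networks (FCNet) for real-time robot decision-making in embodied learning | locomotion, reinforcement learning
28 | Iterative Learning Control of Fast, Nonlinear, Oscillatory Dynamics (Preprint) | Proposes a control method for fast, nonlinear, oscillatory systems based on iterative learning control | trajectory optimization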

🔬 Pillar 8: Physics-based Animation (1 paper)

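# | Title | One-line summary | Tags | 🔗
29 | Spatiotemporal Predictions of Toxic Urban Plumes Using Deep Learning | Proposes ST-GasNet, a deep learning model for rapidly predicting the spatiotemporal evolution of toxic gas dispersion in cities | spatiotemporal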
