cs.LG (2024-05-30)

📊 29 papers total | 🔗 6 with code

🎯 Interest Area Navigation

Pillar 2: RL & Architecture (17 🔗4) Pillar 9: Embodied Foundation Models (9 🔗1) Pillar 1: Robot Control (2 🔗1) Pillar 8: Physics-based Animation (1)

🔬 Pillar 2: RL & Architecture (17 papers)

#	Title	One-line summary	Tags	🔗	⭐
1	Diffusion Policies creating a Trust Region for Offline Reinforcement Learning	Proposes DTQL: a diffusion-based trust region that accelerates offline reinforcement learning, balancing performance and efficiency	reinforcement learning offline RL offline reinforcement learning	✅
2	Aquatic Navigation: A Challenging Benchmark for Deep Reinforcement Learning	Proposes an aquatic navigation benchmark environment to evaluate and improve deep reinforcement learning algorithms	reinforcement learning deep reinforcement learning DRL
3	Learning from Random Demonstrations: Offline Reinforcement Learning with Importance-Sampled Diffusion Models	Proposes an offline reinforcement learning method using importance-sampled diffusion models, improving policy learning from random data	reinforcement learning offline reinforcement learning world model
4	Adaptive Advantage-Guided Policy Regularization for Offline Reinforcement Learning	Proposes Adaptive Advantage-Guided Policy Regularization (A2PR) to address over-conservatism in offline reinforcement learning	reinforcement learning offline reinforcement learning	✅
5	Sequence-Augmented SE(3)-Flow Matching For Conditional Protein Backbone Generation	FoldFlow-2: sequence-augmented SE(3) flow matching for conditional protein backbone generation	flow matching large language model
6	From Words to Actions: Unveiling the Theoretical Underpinnings of LLM-Driven Autonomous Systems	Analyzes LLM-driven autonomous systems from a theoretical perspective and proposes improvement strategies	reinforcement learning imitation learning world model
7	Algorithmic Fairness in Performative Policy Learning: Escaping the Impossibility of Group Fairness	Exploits performativity in policy learning to resolve the group fairness dilemma	policy learning predictive model
8	Preference Alignment with Flow Matching	Proposes Preference Flow Matching for efficient preference alignment of pretrained models	reinforcement learning flow matching	✅
9	Transformers and Slot Encoding for Sample Efficient Physical World Modelling	Proposes a world modelling method based on Transformers and slot encoding that improves sample efficiency	world model	✅
10	SleeperNets: Universal Backdoor Poisoning Attacks Against Reinforcement Learning Agents	Proposes SleeperNets, a universal backdoor poisoning attack against reinforcement learning agents	reinforcement learning
11	FCOM: A Federated Collaborative Online Monitoring Framework via Representation Learning	Proposes a federated collaborative online monitoring framework based on representation learning, addressing resource allocation under heterogeneous data	representation learning
12	Hybrid Reinforcement Learning Framework for Mixed-Variable Problems	Proposes a hybrid reinforcement learning framework for mixed-variable optimization problems	reinforcement learning
13	Length independent generalization bounds for deep SSM architectures via Rademacher contraction and stability constraints	Proposes length-independent PAC bounds for deep state space model (SSM) architectures	SSM
14	Randomized Exploration for Reinforcement Learning with Multinomial Logistic Function Approximation	Proposes randomized exploration algorithms for reinforcement learning with multinomial logistic function approximation	reinforcement learning
15	Joint Selective State Space Model and Detrending for Robust Time Series Anomaly Detection	Proposes a multi-stage time series anomaly detection method combining a selective state space model with detrending	state space model
16	MetaCURL: Non-stationary Concave Utility Reinforcement Learning	Proposes the MetaCURL algorithm for concave utility reinforcement learning in non-stationary MDPs	reinforcement learning
17	Q-learning as a monotone scheme	Interprets Q-learning as a monotone scheme and analyzes how function approximation affects stability	reinforcement learning deep reinforcement learning

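🔬 Pillar 9: Embodied Foundation Models (9 papers)

#	Title	One-line summary	Tags	🔗	⭐
18	Large Language Models Can Self-Improve At Web Agent Tasks	LLMs improve their web agent task performance through self-improvement, with a 31% gain on WebArena	large language model
19	Multimodal Lego: Model Merging and Fine-Tuning Across Topologies and Modalities in Biomedicine	Proposes MM-Lego, a general multimodal fusion framework for biomedicine that achieves high performance with little or no fine-tuning	multimodal
20	SysCaps: Language Interfaces for Simulation Surrogates of Complex Systems	SysCaps: natural language interfaces that improve the usability and generalization of simulation surrogate models for complex systems	large language model multimodal
21	ETHER: Efficient Finetuning of Large-Scale Models with Hyperplane Reflections	Proposes ETHER, an efficient fine-tuning method for large models based on hyperplane reflections that significantly reduces parameter count	foundation model	✅
22	Knockout: A simple way to handle missing inputs	Proposes Knockout, a method for handling missing inputs in deep learning models	multimodal
23	Kernel Language Entropy: Fine-grained Uncertainty Quantification for LLMs from Semantic Similarities	Proposes Kernel Language Entropy (KLE) for fine-grained quantification of semantic uncertainty in large language models	large language model
24	Similarity is Not All You Need: Endowing Retrieval Augmented Generation with Multi Layered Thoughts	Proposes the MetRag framework to address over-reliance on similarity in retrieval-augmented generation	large language model
25	Parrot: Efficient Serving of LLM-based Applications with Semantic Variable	Parrot: efficient serving of LLM-based applications using semantic variables	large language model
26	Why Larger Language Models Do In-context Learning Differently?	Theoretical analysis of why in-context learning differs across language model sizes: model scale affects sensitivity to noise	large language model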

🔬 Pillar 1: Robot Control (2 papers)

#	Title	One-line summary	Tags	🔗	⭐
27	Fourier Controller Networks for Real-Time Decision-Making in Embodied Learning	Proposes Fourier Controller Networks (FCNet) for real-time robot decision-making in embodied learning	locomotion reinforcement learning	✅
28	Iterative Learning Control of Fast, Nonlinear, Oscillatory Dynamics (Preprint)	Proposes an iterative learning control method for fast, nonlinear, oscillatory systems	trajectory optimization

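🔬 Pillar 8: Physics-based Animation (1 paper)

#	Title	One-line summary	Tags	🔗	⭐
29	Spatiotemporal Predictions of Toxic Urban Plumes Using Deep Learning	Proposes ST-GasNet, a deep learning model for rapidly predicting the spatiotemporal evolution of toxic gas dispersion in cities	spatiotemporal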
