cs.LG (2024-05-27)

📊 24 papers in total | 🔗 4 with code

🎯 Interest Area Navigation

Pillar 2: RL & Architecture (12 🔗1) · Pillar 9: Embodied Foundation Models (10 🔗2) · Pillar 5: Interaction & Reaction (1 🔗1) · Pillar 4: Generative Motion (1)

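🔬 Pillar 2: RL & Architecture (12 papers)

#	Title	Key point in one sentence	Tags	🔗	⭐
1	Symmetric Reinforcement Learning Loss for Robust Learning on Diverse Tasks and Model Scales	Proposes a symmetric reinforcement learning loss that improves RL robustness across diverse tasks and model scales	reinforcement learning PPO RLHF
2	FedHPL: Efficient Heterogeneous Federated Learning with Prompt Tuning and Logit Distillation	FedHPL: an efficient heterogeneous federated learning framework based on prompt tuning and logit distillation	distillation foundation model
3	Linear Function Approximation as a Computationally Efficient Method to Solve Classical Reinforcement Learning Challenges	Proposes an NPG method with linear function approximation to speed up solving low-dimensional reinforcement learning problems	reinforcement learning PPO
4	Partial Models for Building Adaptive Model-Based Reinforcement Learning Agents	Proposes an adaptive model-based reinforcement learning method built on partial models, improving adaptation to local changes in the environment	reinforcement learning dreamer
5	A Recipe for Unbounded Data Augmentation in Visual Reinforcement Learning	Proposes SADA, a general data augmentation method for visual reinforcement learning that improves training stability and generalization	reinforcement learning	✅
6	Finding Shared Decodable Concepts and their Negations in the Brain	Proposes a brain activity decoding method based on contrastive learning and clustering that finds shared decodable concepts and their negations in the brain	contrastive learning multimodal
7	Spectral regularization for adversarially-robust representation learning	Proposes a spectral regularization method that improves the adversarial robustness of representation learning, particularly for self-supervised learning	representation learning
8	SMR: State Memory Replay for Long Sequence Modeling	Proposes the State Memory Replay (SMR) mechanism to address non-stable states in SSM long-sequence modeling	Mamba SSM state space model
9	How Do the Architecture and Optimizer Affect Representation Learning? On the Training Dynamics of Representations in Deep Neural Networks	Studies how architecture and optimizer affect the training dynamics of representation learning in deep neural networks	representation learning
10	Opinion-Guided Reinforcement Learning	Proposes an opinion-guided reinforcement learning method that uses opinions carrying uncertainty to improve agent learning efficiency	reinforcement learning
11	Surprise-Adaptive Intrinsic Motivation for Unsupervised Reinforcement Learning	Proposes an unsupervised reinforcement learning method with adaptive intrinsic motivation, improving agent learning across environments of differing entropy	reinforcement learning
12	Oracle-Efficient Reinforcement Learning for Max Value Ensembles	Proposes an efficient reinforcement learning algorithm that improves on existing policies through a max-value ensemble policy	reinforcement learning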

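🔬 Pillar 9: Embodied Foundation Models (10 papers)

#	Title	Key point in one sentence	Tags	🔗	⭐
13	Navigating the Safety Landscape: Measuring Risks in Finetuning Large Language Models	Proposes the VISAGE metric, which assesses finetuning risk by exploring the safety region of LLMs	large language model	✅
14	Phase Transitions in the Output Distribution of Large Language Models	Uses phase-transition detection methods from physics to automatically discover behavioral changes in the output distribution of large language models	large language model
15	Mechanistic Interpretability of Binary and Ternary Transformers	Studies the interpretability of binary and ternary Transformer networks, revealing their algorithmic similarity to full-precision networks	large language model
16	Salutary Labeling with Zero Human Annotation	Proposes Salutary Labeling, which assigns optimal labels to informative samples without human annotation, improving model performance	large language model
17	LoRA-XS: Low-Rank Adaptation with Extremely Small Number of Parameters	LoRA-XS: a low-rank adaptation finetuning method with an extremely small number of parameters that substantially reduces storage requirements	large language model
18	RTL-Repo: A Benchmark for Evaluating LLMs on Large-Scale RTL Design Projects	RTL-Repo: a benchmark for evaluating LLMs on large-scale RTL design projects	large language model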
19	Trans-LoRA: towards data-free Transferable Parameter Efficient Finetuning	Proposes Trans-LoRA, enabling lossless, nearly data-free transfer of LoRA modules across base models	large language model
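20	Autoformalizing Euclidean Geometry	Proposes an autoformalization framework for Euclidean geometry that combines domain knowledge, SMT solvers, and LLMs	large language model	✅
21	CHESS: Contextual Harnessing for Efficient SQL Synthesis	CHESS: an efficient multi-agent framework for SQL synthesis that exploits contextual information	large language model
22	Latent Energy-Based Odyssey: Black-Box Optimization via Expanded Exploration in the Energy-Based Latent Space	Proposes an energy-based latent-space exploration method for offline black-box optimization	multimodal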

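🔬 Pillar 5: Interaction & Reaction (1 paper)

#	Title	Key point in one sentence	Tags	🔗	⭐
23	Interpretable Prognostics with Concept Bottleneck Models	Proposes an interpretable remaining-useful-life prediction method based on concept bottleneck models, improving the trustworthiness of predictions for industrial assets	IMoS	✅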

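🔬 Pillar 4: Generative Motion (1 paper)

#	Title	Key point in one sentence	Tags	🔗	⭐
24	BeamVQ: Aligning Space-Time Forecasting Model via Self-training on Physics-aware Metrics	Proposes BeamVQ, which aligns spatiotemporal forecasting models via self-training on physics-aware metrics, markedly improving the physical plausibility of forecasts	physically plausible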
