cs.LG (2024-05-27)

📊 24 papers in total | 🔗 4 with code

🎯 Interest Area Navigation

Pillar 2: RL & Architecture (12 🔗1) · Pillar 9: Embodied Foundation Models (10 🔗2) · Pillar 5: Interaction & Reaction (1 🔗1) · Pillar 4: Generative Motion (1)

🔬 Pillar 2: RL & Architecture (12 papers)

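| # | Title | Key point | Tags | 🔗 |
|---|---|---|---|---|
| 1 | Symmetric Reinforcement Learning Loss for Robust Learning on Diverse Tasks and Model Scales | Proposes a symmetric reinforcement learning loss that makes RL more robust across diverse tasks and model scales | reinforcement learning, PPO, RLHF | |
| 2 | FedHPL: Efficient Heterogeneous Federated Learning with Prompt Tuning and Logit Distillation | FedHPL: an efficient heterogeneous federated learning framework based on prompt tuning and logit distillation | distillation, foundation model | |
| 3 | Linear Function Approximation as a Computationally Efficient Method to Solve Classical Reinforcement Learning Challenges | Proposes an NPG method with linear function approximation to speed up solving low-dimensional reinforcement learning problems | reinforcement learning, PPO | |
| 4 | Partial Models for Building Adaptive Model-Based Reinforcement Learning Agents | Proposes an adaptive model-based RL method built on partial models, improving adaptation to local changes in the environment | reinforcement learning, dreamer | |
| 5 | A Recipe for Unbounded Data Augmentation in Visual Reinforcement Learning | Proposes SADA, a general data augmentation method for visual reinforcement learning that improves training stability and generalization | reinforcement learning | |
| 6 | Finding Shared Decodable Concepts and their Negations in the Brain | Proposes a brain-activity decoding method based on contrastive learning and clustering that discovers shared decodable concepts and their negations in the brain | contrastive learning, multimodal | |
| 7 | Spectral regularization for adversarially-robust representation learning | Proposes a spectral regularization method that improves the adversarial robustness of representation learning, especially in self-supervised learning | representation learning | |
| 8 | SMR: State Memory Replay for Long Sequence Modeling | Proposes the State Memory Replay (SMR) mechanism to address non-stable states in SSM long-sequence modeling | Mamba, SSM, state space model | |
| 9 | How Do the Architecture and Optimizer Affect Representation Learning? On the Training Dynamics of Representations in Deep Neural Networks | Studies how architecture and optimizer affect the training dynamics of representation learning in deep neural networks | representation learning | |
| 10 | Opinion-Guided Reinforcement Learning | Proposes an opinion-guided reinforcement learning method that uses uncertain opinions to improve the agent's learning efficiency | reinforcement learning | |
| 11 | Surprise-Adaptive Intrinsic Motivation for Unsupervised Reinforcement Learning | Proposes an unsupervised RL method with adaptive intrinsic motivation, improving agents' ability to learn in environments of differing entropy | reinforcement learning | |
| 12 | Oracle-Efficient Reinforcement Learning for Max Value Ensembles | Proposes an efficient RL algorithm that improves on existing policies through a max-value ensemble policy | reinforcement learning | |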

🔬 Pillar 9: Embodied Foundation Models (10 papers)

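| # | Title | Key point | Tags | 🔗 |
|---|---|---|---|---|
| 13 | Navigating the Safety Landscape: Measuring Risks in Finetuning Large Language Models | Proposes the VISAGE metric, which assesses finetuning risk by exploring the LLM's safety region | large language model | |
| 14 | Phase Transitions in the Output Distribution of Large Language Models | Uses phase-transition detection methods from physics to automatically discover behavioral changes in the output distribution of large language models | large language model | |
| 15 | Mechanistic Interpretability of Binary and Ternary Transformers | Studies the interpretability of binary and ternary Transformer networks, revealing their algorithmic similarity to full-precision networks | large language model | |
| 16 | Salutary Labeling with Zero Human Annotation | Proposes Salutary Labeling, which assigns optimal labels to highly informative samples without human annotation, improving model performance | large language model | |
| 17 | LoRA-XS: Low-Rank Adaptation with Extremely Small Number of Parameters | LoRA-XS: a low-rank adaptation finetuning method with an extremely small number of parameters, significantly reducing storage requirements | large language model | |
| 18 | RTL-Repo: A Benchmark for Evaluating LLMs on Large-Scale RTL Design Projects | RTL-Repo: a benchmark for evaluating LLMs on large-scale RTL design projects | large language model | |
| 19 | Trans-LoRA: towards data-free Transferable Parameter Efficient Finetuning | Proposes Trans-LoRA, enabling lossless, nearly data-free transfer of LoRA modules across base models | large language model | |
| 20 | Autoformalizing Euclidean Geometry | Proposes an autoformalization framework for Euclidean geometry that combines domain knowledge, SMT solvers, and LLMs | large language model | |
| 21 | CHESS: Contextual Harnessing for Efficient SQL Synthesis | CHESS: an efficient multi-agent framework for SQL synthesis that exploits contextual information | large language model | |
| 22 | Latent Energy-Based Odyssey: Black-Box Optimization via Expanded Exploration in the Energy-Based Latent Space | Proposes an energy-based latent-space exploration method for offline black-box optimization | multimodal | |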

🔬 Pillar 5: Interaction & Reaction (1 paper)

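| # | Title | Key point | Tags | 🔗 |
|---|---|---|---|---|
| 23 | Interpretable Prognostics with Concept Bottleneck Models | Proposes an interpretable remaining-useful-life prediction method based on concept bottleneck models, improving the trustworthiness of predictions for industrial assets | IMoS | |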

🔬 Pillar 4: Generative Motion (1 paper)

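| # | Title | Key point | Tags | 🔗 |
|---|---|---|---|---|
| 24 | BeamVQ: Aligning Space-Time Forecasting Model via Self-training on Physics-aware Metrics | Proposes BeamVQ, which aligns spatiotemporal forecasting models through self-training on physics-aware metrics, significantly improving the physical plausibility of predictions | physically plausible | |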
