cs.LG (2024-05-28)

📊 36 papers total | 🔗 10 with code

🎯 Interest Area Navigation

Pillar 2: RL Algorithms & Architecture (18, 🔗 5) · Pillar 9: Embodied Foundation Models (14, 🔗 5) · Pillar 1: Robot Control (3) · Pillar 5: Interaction & Reaction (1)

🔬 Pillar 2: RL Algorithms & Architecture (18 papers)

# | Title | One-line summary | Tags
1 | Empowering Source-Free Domain Adaptation via MLLM-Guided Reliability-Based Curriculum Learning | Proposes MLLM-guided, reliability-based curriculum learning for source-free domain adaptation | curriculum learning, large language model, foundation model
2 | HarmoDT: Harmony Multi-Task Decision Transformer for Offline Reinforcement Learning | HarmoDT learns harmonious parameter subspaces to tackle policy optimization in offline multi-task RL | reinforcement learning, offline reinforcement learning, decision transformer
3 | Large Language Model-Driven Curriculum Design for Mobile Networks | Proposes an LLM-driven curriculum design framework for mobile networks that improves RL performance | reinforcement learning, curriculum learning, large language model
4 | Reinforcement Learning in Dynamic Treatment Regimes Needs Critical Reexamination | Critically re-examines the effectiveness of offline RL applied to dynamic treatment regimes | reinforcement learning, offline RL, offline reinforcement learning
5 | SleepFM: Multi-modal Representation Learning for Sleep Across Brain Activity, ECG and Respiratory Signals | SleepFM learns multi-modal representations of brain activity, ECG, and respiratory signals for sleep analysis | representation learning, contrastive learning, foundation model
6 | Atlas3D: Physically Constrained Self-Supporting Text-to-3D for Simulation and Fabrication | Atlas3D generates physically constrained, self-supporting 3D models from text for simulation and fabrication | distillation, differentiable simulation, embodied AI
7 | Bridging Mini-Batch and Asymptotic Analysis in Contrastive Learning: From InfoNCE to Kernel-Based Losses | Unifies contrastive losses from InfoNCE to kernel-based formulations and proposes the new DHEL loss | representation learning, contrastive learning
8 | Offline-Boosted Actor-Critic: Adaptively Blending Optimal Historical Behaviors in Deep Off-Policy RL | Proposes an offline-boosted actor-critic that adaptively blends optimal historical behaviors to improve deep off-policy RL | reinforcement learning, policy learning, offline RL
9 | In-Context Symmetries: Self-Supervised Learning through Contextual World Models | Proposes ContextSSL, which self-supervises task-adaptive symmetry representations through contextual world models | world model
10 | No $D_{\text{train}}$: Model-Agnostic Counterfactual Explanations Using Reinforcement Learning | Proposes NTD-CFE, a model-agnostic RL method for counterfactual explanations that needs no training data and handles both static and time-series data | reinforcement learning
11 | Highway Reinforcement Learning | Proposes highway gates to address the underestimation problem in multi-step off-policy RL | reinforcement learning
12 | Back to the Drawing Board for Fair Representation Learning | Revisits fair representation learning, focusing on transfer tasks to avoid overfitting to proxy tasks | representation learning
13 | Individual Contributions as Intrinsic Exploration Scaffolds for Multi-agent Reinforcement Learning | Proposes ICES, which uses individual contributions as intrinsic exploration scaffolds to tackle sparse-reward exploration in MARL | reinforcement learning
14 | A Pontryagin Perspective on Reinforcement Learning | Proposes open-loop RL algorithms based on Pontryagin's principle, improving performance on high-dimensional control tasks | reinforcement learning
15 | Mutation-Bias Learning in Games | Proposes a mutation-bias multi-agent RL algorithm grounded in evolutionary game theory, improving convergence in complex environments | reinforcement learning, PHC
16 | Mollification Effects of Policy Gradient Methods | Reveals how policy gradient methods mollify (smooth) non-smooth optimization problems, and the limits of this effect | reinforcement learning, deep reinforcement learning
17 | AlignIQL: Policy Alignment in Implicit Q-Learning through Constrained Optimization | AlignIQL achieves policy alignment in implicit Q-learning via constrained optimization | offline RL, IQL
18 | Imitating from auxiliary imperfect demonstrations via Adversarial Density Weighted Regression | Proposes Adversarial Density-Weighted Regression (ADR), an imitation learning framework that exploits auxiliary imperfect demonstrations to improve policy performance | IQL, imitation learning
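Entry 7 above studies the gap between mini-batch InfoNCE and its asymptotic, kernel-based form. As background only, here is a minimal numpy sketch of the standard InfoNCE loss (the widely used formulation, not the paper's proposed DHEL loss, whose definition is not given here): each anchor is scored against all candidates in the batch, and the matching pair on the diagonal is treated as the target class of a softmax cross-entropy.

```python
import numpy as np

def info_nce(z1, z2, temperature=0.1):
    """Standard InfoNCE over a batch of positive pairs (z1[i], z2[i])."""
    # L2-normalise rows so dot products become cosine similarities.
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = z1 @ z2.T / temperature              # (N, N) similarity matrix
    # Cross-entropy with the diagonal (the true pairs) as targets.
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))

rng = np.random.default_rng(0)
anchors = rng.normal(size=(8, 16))
loss_aligned = info_nce(anchors, anchors)                  # perfect positives
loss_random = info_nce(anchors, rng.normal(size=(8, 16)))  # unrelated pairs
```

As a sanity check, perfectly aligned pairs should incur a much lower loss than randomly paired embeddings.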

🔬 Pillar 9: Embodied Foundation Models (14 papers)

# | Title | One-line summary | Tags
19 | Lisa: Lazy Safety Alignment for Large Language Models against Harmful Fine-tuning Attack | Lisa is a lazy safety-alignment method that defends LLMs against harmful fine-tuning attacks | large language model
20 | FinerCut: Finer-grained Interpretable Layer Pruning for Large Language Models | FinerCut is a finer-grained, interpretable layer-pruning method for LLMs | large language model
21 | Pipette: Automatic Fine-grained Large Language Model Training Configurator for Real-World Clusters | Pipette automatically generates fine-grained LLM training configurations for real-world clusters | large language model
22 | I-LLM: Efficient Integer-Only Inference for Fully-Quantized Low-Bit Large Language Models | I-LLM is an efficient integer-only inference framework for fully quantized low-bit LLMs | large language model
23 | A Theoretical Understanding of Self-Correction through In-context Alignment | Theoretically analyzes Transformers' self-correction via in-context alignment, revealing the role of key design choices | large language model, foundation model
24 | Hardware-Aware Parallel Prompt Decoding for Memory-Efficient Acceleration of LLM Inference | Proposes hardware-aware Parallel Prompt Decoding (PPD) to accelerate LLM inference with a small memory footprint | large language model
25 | Low-rank finetuning for LLMs: A fairness perspective | Reveals fairness limitations of low-rank LLM fine-tuning, highlighting challenges for bias and toxicity mitigation | large language model
26 | Unsupervised Model Tree Heritage Recovery | Proposes an unsupervised method for recovering model trees, automatically discovering lineage relations among models | foundation model
27 | Outlier-weighed Layerwise Sampling for LLM Fine-tuning | Proposes Outlier-weighed Layerwise Sampling (OWS) for efficient LLM fine-tuning, improving performance while reducing memory requirements | large language model
28 | Exploiting LLM Quantization | Reveals a security vulnerability of LLM quantization: a model can be benign at full precision yet malicious once quantized | large language model
29 | An Empirical Analysis of Forgetting in Pre-trained Models with Incremental Low-Rank Updates | Studies how LoRA rank affects forgetting in pre-trained models under incremental low-rank updates | foundation model
30 | Efficient Time Series Processing for Transformers and State-Space Models through Token Merging | Proposes local token merging to improve time-series processing efficiency | foundation model
31 | Exploring Activation Patterns of Parameters in Language Models | Proposes a gradient-based parameter-activation metric to probe the inner workings of language models | large language model
32 | Linguistic Collapse: Neural Collapse in (Large) Language Models | Finds an emergent linguistic (neural) collapse phenomenon in large language models and links it to generalization | large language model
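Entries 25 and 29 both concern low-rank (LoRA-style) fine-tuning. As background, a minimal numpy sketch of the standard LoRA reparameterization (all names and sizes here are illustrative, not taken from the listed papers): a frozen weight W is adapted by a trainable rank-r product B A, scaled by alpha/r, so the update can never exceed rank r.

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, r, alpha = 32, 64, 4, 8

W = rng.normal(size=(d_out, d_in))      # frozen pre-trained weight
A = rng.normal(size=(r, d_in)) * 0.01   # trainable down-projection
B = np.zeros((d_out, r))                # trainable up-projection, zero-init

def lora_forward(x):
    """y = W x + (alpha / r) * B A x  -- only A and B are trained."""
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.normal(size=d_in)
# With B zero-initialised, the adapter starts as an exact no-op:
assert np.allclose(lora_forward(x), W @ x)

# After some (here simulated) training steps, B is no longer zero,
# but the weight update is still bounded by rank r:
B = rng.normal(size=(d_out, r)) * 0.01
delta = (alpha / r) * B @ A
assert np.linalg.matrix_rank(delta) <= r
```

The rank bound is what makes such adapters cheap to store and merge, and it is exactly the knob (the rank r) whose effect on forgetting entry 29 studies empirically.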

🔬 Pillar 1: Robot Control (3 papers)

# | Title | One-line summary | Tags
33 | Hierarchical World Models as Visual Whole-Body Humanoid Controllers | Proposes hierarchical world models as visual whole-body humanoid controllers | humanoid, humanoid control, bipedal
34 | Adaptive Horizon Actor-Critic for Policy Learning in Contact-Rich Differentiable Simulation | Proposes an adaptive-horizon actor-critic that mitigates gradient errors in policy learning under contact-rich differentiable simulation | locomotion, reinforcement learning, policy learning
35 | Reinforced Model Predictive Control via Trust-Region Quasi-Newton Policy Optimization | Proposes reinforced model predictive control via trust-region quasi-Newton policy optimization, improving data efficiency and control accuracy | model predictive control, reinforcement learning
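Entry 35 combines model predictive control with policy optimization. As background, a minimal random-shooting sketch of the generic receding-horizon MPC loop itself (an illustration of plain MPC, not the paper's trust-region quasi-Newton method): at each step, sample candidate action sequences, roll them out through the model, apply only the first action of the best sequence, then re-plan from the new state.

```python
import numpy as np

rng = np.random.default_rng(0)

# Known linear model x' = A x + B u with quadratic cost; goal: drive x to 0.
A = np.array([[1.0, 0.1], [0.0, 1.0]])   # double-integrator dynamics, dt = 0.1
B = np.array([[0.0], [0.1]])

def rollout_cost(x, u_seq):
    """Total quadratic cost of an action sequence under the model."""
    cost = 0.0
    for u in u_seq:
        x = A @ x + B @ u
        cost += x @ x + 0.01 * (u @ u)
    return cost

def mpc_step(x, horizon=10, n_samples=256):
    """Random-shooting MPC: return the first action of the cheapest sequence."""
    candidates = rng.uniform(-1, 1, size=(n_samples, horizon, 1))
    costs = [rollout_cost(x, u_seq) for u_seq in candidates]
    return candidates[int(np.argmin(costs))][0]

x = np.array([1.0, 0.0])
for _ in range(30):                       # receding-horizon loop: re-plan each step
    x = A @ x + B @ mpc_step(x)
# x should now be noticeably closer to the origin than the initial state
```

Random shooting is the crudest inner optimizer; the point of methods like entry 35 is to replace it with a far more sample-efficient gradient-based solve while keeping this same outer re-planning loop.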

🔬 Pillar 5: Interaction & Reaction (1 paper)

# | Title | One-line summary | Tags
36 | Spectral Truncation Kernels: Noncommutativity in $C^*$-algebraic Kernel Machines | Proposes spectral-truncation C*-algebraic kernels, bringing noncommutativity beyond the reach of conventional kernel methods | ReMoS
