cs.LG (2024-05-28)

📊 36 papers total | 🔗 10 with code

🎯 Interest Area Navigation

Pillar 2: RL Algorithms & Architecture (18, 🔗 5) · Pillar 9: Embodied Foundation Models (14, 🔗 5) · Pillar 1: Robot Control (3) · Pillar 5: Interaction & Reaction (1)

🔬 Pillar 2: RL Algorithms & Architecture (18 papers)

# | Title | One-line summary | Tags
1 | Empowering Source-Free Domain Adaptation via MLLM-Guided Reliability-Based Curriculum Learning | Proposes MLLM-guided, reliability-based curriculum learning for source-free domain adaptation | curriculum learning, large language model, foundation model
2 | HarmoDT: Harmony Multi-Task Decision Transformer for Offline Reinforcement Learning | HarmoDT learns harmonious parameter subspaces to tackle policy optimization in offline multi-task RL | reinforcement learning, offline reinforcement learning, decision transformer
3 | Large Language Model-Driven Curriculum Design for Mobile Networks | Proposes an LLM-driven curriculum design framework for mobile networks that improves RL performance | reinforcement learning, curriculum learning, large language model
4 | Reinforcement Learning in Dynamic Treatment Regimes Needs Critical Reexamination | Critically re-examines the effectiveness of offline RL applied to dynamic treatment regimes | reinforcement learning, offline RL, offline reinforcement learning
5 | SleepFM: Multi-modal Representation Learning for Sleep Across Brain Activity, ECG and Respiratory Signals | SleepFM learns multi-modal representations of brain activity, ECG, and respiratory signals for sleep analysis | representation learning, contrastive learning, foundation model
6 | Atlas3D: Physically Constrained Self-Supporting Text-to-3D for Simulation and Fabrication | Atlas3D generates physically constrained, self-supporting 3D models from text for simulation and fabrication | distillation, differentiable simulation, embodied AI
7 | Bridging Mini-Batch and Asymptotic Analysis in Contrastive Learning: From InfoNCE to Kernel-Based Losses | Unifies contrastive losses from InfoNCE to kernel-based formulations and proposes the new DHEL loss | representation learning, contrastive learning
8 | Offline-Boosted Actor-Critic: Adaptively Blending Optimal Historical Behaviors in Deep Off-Policy RL | Proposes an offline-boosted actor-critic that adaptively blends optimal historical behaviors to improve deep off-policy RL | reinforcement learning, policy learning, offline RL
9 | In-Context Symmetries: Self-Supervised Learning through Contextual World Models | Proposes ContextSSL, which self-supervises task-adaptive symmetry representations through contextual world models | world model
10 | No $D_{\text{train}}$: Model-Agnostic Counterfactual Explanations Using Reinforcement Learning | Proposes NTD-CFE, a model-agnostic RL method for counterfactual explanations that needs no training data and handles both static and time-series data | reinforcement learning
11 | Highway Reinforcement Learning | Proposes highway gates to address the underestimation problem in multi-step off-policy RL | reinforcement learning
12 | Back to the Drawing Board for Fair Representation Learning | Revisits fair representation learning, focusing on transfer tasks to avoid overfitting to proxy tasks | representation learning
13 | Individual Contributions as Intrinsic Exploration Scaffolds for Multi-agent Reinforcement Learning | Proposes ICES, which uses individual contributions as intrinsic exploration scaffolds to tackle sparse-reward exploration in MARL | reinforcement learning
14 | A Pontryagin Perspective on Reinforcement Learning | Proposes open-loop RL algorithms based on Pontryagin's principle, improving performance on high-dimensional control tasks | reinforcement learning
15 | Mutation-Bias Learning in Games | Proposes a mutation-bias multi-agent RL algorithm grounded in evolutionary game theory, improving convergence in complex environments | reinforcement learning, PHC
16 | Mollification Effects of Policy Gradient Methods | Reveals how policy gradient methods mollify (smooth) non-smooth optimization problems, and the limits of this effect | reinforcement learning, deep reinforcement learning
17 | AlignIQL: Policy Alignment in Implicit Q-Learning through Constrained Optimization | AlignIQL achieves policy alignment in implicit Q-learning via constrained optimization | offline RL, IQL
18 | Imitating from auxiliary imperfect demonstrations via Adversarial Density Weighted Regression | Proposes Adversarial Density-Weighted Regression (ADR), an imitation learning framework that exploits auxiliary imperfect demonstrations to improve policy performance | IQL, imitation learning
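Entry 7 above studies the gap between mini-batch InfoNCE and its asymptotic, kernel-based form. As background only, here is a minimal numpy sketch of the standard InfoNCE loss (the widely used formulation, not the paper's proposed DHEL loss, whose definition is not given here): each anchor is scored against all candidates in the batch, and the matching pair on the diagonal is treated as the target class of a softmax cross-entropy.

```python
import numpy as np

def info_nce(z1, z2, temperature=0.1):
    """Standard InfoNCE over a batch of positive pairs (z1[i], z2[i])."""
    # L2-normalise rows so dot products become cosine similarities.
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = z1 @ z2.T / temperature              # (N, N) similarity matrix
    # Cross-entropy with the diagonal (the true pairs) as targets.
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))

rng = np.random.default_rng(0)
anchors = rng.normal(size=(8, 16))
loss_aligned = info_nce(anchors, anchors)                  # perfect positives
loss_random = info_nce(anchors, rng.normal(size=(8, 16)))  # unrelated pairs
```

As a sanity check, perfectly aligned pairs should incur a much lower loss than randomly paired embeddings.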

🔬 Pillar 9: Embodied Foundation Models (14 papers)

# | Title | One-line summary | Tags
19 | Lisa: Lazy Safety Alignment for Large Language Models against Harmful Fine-tuning Attack | Lisa is a lazy safety-alignment method that defends LLMs against harmful fine-tuning attacks | large language model
20 | FinerCut: Finer-grained Interpretable Layer Pruning for Large Language Models | FinerCut is a finer-grained, interpretable layer-pruning method for LLMs | large language model
21 | Pipette: Automatic Fine-grained Large Language Model Training Configurator for Real-World Clusters | Pipette automatically generates fine-grained LLM training configurations for real-world clusters | large language model
22 | I-LLM: Efficient Integer-Only Inference for Fully-Quantized Low-Bit Large Language Models | I-LLM is an efficient integer-only inference framework for fully quantized low-bit LLMs | large language model
23 | A Theoretical Understanding of Self-Correction through In-context Alignment | Theoretically analyzes Transformers' self-correction via in-context alignment, revealing the role of key design choices | large language model, foundation model
24 | Hardware-Aware Parallel Prompt Decoding for Memory-Efficient Acceleration of LLM Inference | Proposes hardware-aware Parallel Prompt Decoding (PPD) to accelerate LLM inference with a small memory footprint | large language model
25 | Low-rank finetuning for LLMs: A fairness perspective | Reveals fairness limitations of low-rank LLM fine-tuning, highlighting challenges for bias and toxicity mitigation | large language model
26 | Unsupervised Model Tree Heritage Recovery | Proposes an unsupervised method for recovering model trees, automatically discovering lineage relations among models | foundation model
27 | Outlier-weighed Layerwise Sampling for LLM Fine-tuning | Proposes Outlier-weighed Layerwise Sampling (OWS) for efficient LLM fine-tuning, improving performance while reducing memory requirements | large language model
28 | Exploiting LLM Quantization | Reveals a security vulnerability of LLM quantization: a model can be benign at full precision yet malicious once quantized | large language model
29 | An Empirical Analysis of Forgetting in Pre-trained Models with Incremental Low-Rank Updates | Studies how LoRA rank affects forgetting in pre-trained models under incremental low-rank updates | foundation model
30 | Efficient Time Series Processing for Transformers and State-Space Models through Token Merging | Proposes local token merging to improve time-series processing efficiency | foundation model
31 | Exploring Activation Patterns of Parameters in Language Models | Proposes a gradient-based parameter-activation metric to probe the inner workings of language models | large language model
32 | Linguistic Collapse: Neural Collapse in (Large) Language Models | Finds an emergent linguistic (neural) collapse phenomenon in large language models and links it to generalization | large language model
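Entries 25 and 29 both concern low-rank (LoRA-style) fine-tuning. As background, a minimal numpy sketch of the standard LoRA reparameterization (all names and sizes here are illustrative, not taken from the listed papers): a frozen weight W is adapted by a trainable rank-r product B A, scaled by alpha/r, so the update can never exceed rank r.

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, r, alpha = 32, 64, 4, 8

W = rng.normal(size=(d_out, d_in))      # frozen pre-trained weight
A = rng.normal(size=(r, d_in)) * 0.01   # trainable down-projection
B = np.zeros((d_out, r))                # trainable up-projection, zero-init

def lora_forward(x):
    """y = W x + (alpha / r) * B A x  -- only A and B are trained."""
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.normal(size=d_in)
# With B zero-initialised, the adapter starts as an exact no-op:
assert np.allclose(lora_forward(x), W @ x)

# After some (here simulated) training steps, B is no longer zero,
# but the weight update is still bounded by rank r:
B = rng.normal(size=(d_out, r)) * 0.01
delta = (alpha / r) * B @ A
assert np.linalg.matrix_rank(delta) <= r
```

The rank bound is what makes such adapters cheap to store and merge, and it is exactly the knob (the rank r) whose effect on forgetting entry 29 studies empirically.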

🔬 Pillar 1: Robot Control (3 papers)

# | Title | One-line summary | Tags
33 | Hierarchical World Models as Visual Whole-Body Humanoid Controllers | Proposes hierarchical world models as visual whole-body humanoid controllers | humanoid, humanoid control, bipedal
34 | Adaptive Horizon Actor-Critic for Policy Learning in Contact-Rich Differentiable Simulation | Proposes an adaptive-horizon actor-critic that mitigates gradient errors in policy learning under contact-rich differentiable simulation | locomotion, reinforcement learning, policy learning
35 | Reinforced Model Predictive Control via Trust-Region Quasi-Newton Policy Optimization | Proposes reinforced model predictive control via trust-region quasi-Newton policy optimization, improving data efficiency and control accuracy | model predictive control, reinforcement learning
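Entry 35 combines model predictive control with policy optimization. As background, a minimal random-shooting sketch of the generic receding-horizon MPC loop itself (an illustration of plain MPC, not the paper's trust-region quasi-Newton method): at each step, sample candidate action sequences, roll them out through the model, apply only the first action of the best sequence, then re-plan from the new state.

```python
import numpy as np

rng = np.random.default_rng(0)

# Known linear model x' = A x + B u with quadratic cost; goal: drive x to 0.
A = np.array([[1.0, 0.1], [0.0, 1.0]])   # double-integrator dynamics, dt = 0.1
B = np.array([[0.0], [0.1]])

def rollout_cost(x, u_seq):
    """Total quadratic cost of an action sequence under the model."""
    cost = 0.0
    for u in u_seq:
        x = A @ x + B @ u
        cost += x @ x + 0.01 * (u @ u)
    return cost

def mpc_step(x, horizon=10, n_samples=256):
    """Random-shooting MPC: return the first action of the cheapest sequence."""
    candidates = rng.uniform(-1, 1, size=(n_samples, horizon, 1))
    costs = [rollout_cost(x, u_seq) for u_seq in candidates]
    return candidates[int(np.argmin(costs))][0]

x = np.array([1.0, 0.0])
for _ in range(30):                       # receding-horizon loop: re-plan each step
    x = A @ x + B @ mpc_step(x)
# x should now be noticeably closer to the origin than the initial state
```

Random shooting is the crudest inner optimizer; the point of methods like entry 35 is to replace it with a far more sample-efficient gradient-based solve while keeping this same outer re-planning loop.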

🔬 Pillar 5: Interaction & Reaction (1 paper)

# | Title | One-line summary | Tags
36 | Spectral Truncation Kernels: Noncommutativity in $C^*$-algebraic Kernel Machines | Proposes spectral-truncation C*-algebraic kernels, bringing noncommutativity beyond the reach of conventional kernel methods | ReMoS
