cs.LG (2024-10-13)

📊 12 papers in total | 🔗 1 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (6 🔗1) | Pillar 2: RL & Architecture (4) | Pillar 1: Robot Control (1) | Pillar 4: Generative Motion (1)

🔬 Pillar 9: Embodied Foundation Models (6 papers)

#	Title	One-line summary	Tags	🔗	⭐
1	Simultaneous Computation and Memory Efficient Zeroth-Order Optimizer for Fine-Tuning Large Language Models	Proposes LeZO, a compute- and memory-efficient zeroth-order optimizer for fine-tuning large language models	large language model
2	Targeted Vaccine: Safety Alignment for Large Language Models against Harmful Fine-Tuning via Layer-wise Perturbation	Proposes T-Vaccine, which uses layer-wise perturbation to safety-align large language models against harmful fine-tuning	large language model	✅
3	HARDMath: A Benchmark Dataset for Challenging Problems in Applied Mathematics	HARDMath: a benchmark dataset of challenging applied mathematics problems for large language models	large language model, chain-of-thought
4	A Transformer Based Generative Chemical Language AI Model for Structural Elucidation of Organic Compounds	Proposes a Transformer-based generative chemical language AI model for structural elucidation of organic compounds	large language model
5	ALLoRA: Adaptive Learning Rate Mitigates LoRA Fatal Flaws	Proposes ALLoRA to address LoRA's limitations in short training runs	large language model
6	MoIN: Mixture of Introvert Experts to Upcycle an LLM	MoIN: a mixture of introvert experts for upcycling an existing large language model	large language model

🔬 Pillar 2: RL & Architecture (4 papers)

#	Title	One-line summary	Tags	🔗	⭐
7	Self-Data Distillation for Recovering Quality in Pruned Large Language Models	Proposes self-data distilled fine-tuning to recover the quality lost in pruned large language models	distillation, large language model
8	Make the Pertinent Salient: Task-Relevant Reconstruction for Visual Control with Distractions	Proposes Segmentation Dreamer, which uses task-relevant reconstruction to improve the generalization of visual control in environments with distractions	reinforcement learning, dreamer, representation learning
9	Meta-Reinforcement Learning with Universal Policy Adaptation: Provable Near-Optimality under All-task Optimum Comparator	Proposes a meta-reinforcement learning framework based on bilevel optimization that achieves universal policy adaptation with theoretical guarantees	reinforcement learning
10	Improving Generalization on the ProcGen Benchmark with Simple Architectural Changes and Scale	Improves generalization on the ProcGen benchmark through simple architectural changes and scale	reinforcement learning, deep reinforcement learning

🔬 Pillar 1: Robot Control (1 paper)

#	Title	One-line summary	Tags	🔗	⭐
11	SimBa: Simplicity Bias for Scaling Up Parameters in Deep Reinforcement Learning	SimBa: introduces a simplicity bias to scale up the parameter count of deep reinforcement learning models	humanoid, reinforcement learning, deep reinforcement learning

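🔬 Pillar 4: Generative Motion (1 paper)

#	Title	One-line summary	Tags	🔗	⭐
12	A Tidal Current Speed Forecasting Model based on Multi-Periodicity Learning	Proposes a tidal current speed forecasting model based on multi-periodicity learning to improve the stability of renewable energy grid integration	penetration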
