cs.LG (2024-10-13)

📊 12 papers in total | 🔗 1 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (6 🔗1) · Pillar 2: RL & Architecture (4) · Pillar 1: Robot Control (1) · Pillar 4: Generative Motion (1)

🔬 Pillar 9: Embodied Foundation Models (6 papers)

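| # | Title | One-line summary | Tags |
|---|---|---|---|
| 1 | Simultaneous Computation and Memory Efficient Zeroth-Order Optimizer for Fine-Tuning Large Language Models | Proposes LeZO, a computation- and memory-efficient zeroth-order optimizer for fine-tuning large language models. | large language model |
| 2 | Targeted Vaccine: Safety Alignment for Large Language Models against Harmful Fine-Tuning via Layer-wise Perturbation | Proposes T-Vaccine, which uses layer-wise perturbation to safety-align large language models against harmful fine-tuning. | large language model |
| 3 | HARDMath: A Benchmark Dataset for Challenging Problems in Applied Mathematics | HARDMath: a benchmark dataset of challenging applied mathematics problems for large language models. | large language model, chain-of-thought |
| 4 | A Transformer Based Generative Chemical Language AI Model for Structural Elucidation of Organic Compounds | Proposes a Transformer-based generative chemical language AI model for structural elucidation of organic compounds. | large language model |
| 5 | ALLoRA: Adaptive Learning Rate Mitigates LoRA Fatal Flaws | Proposes ALLoRA to address LoRA's limitations in short training runs. | large language model |
| 6 | MoIN: Mixture of Introvert Experts to Upcycle an LLM | MoIN: a mixture of introvert experts for upcycling an existing large language model. | large language model |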

🔬 Pillar 2: RL & Architecture (4 papers)

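| # | Title | One-line summary | Tags |
|---|---|---|---|
| 7 | Self-Data Distillation for Recovering Quality in Pruned Large Language Models | Proposes self-data distillation fine-tuning to recover the quality lost in pruned large language models. | distillation, large language model |
| 8 | Make the Pertinent Salient: Task-Relevant Reconstruction for Visual Control with Distractions | Proposes Segmentation Dreamer, which uses task-relevant reconstruction to improve the generalization of visual control in distracting environments. | reinforcement learning, dreamer, representation learning |
| 9 | Meta-Reinforcement Learning with Universal Policy Adaptation: Provable Near-Optimality under All-task Optimum Comparator | Proposes a meta-reinforcement learning framework based on bilevel optimization that achieves universal policy adaptation with theoretical guarantees. | reinforcement learning |
| 10 | Improving Generalization on the ProcGen Benchmark with Simple Architectural Changes and Scale | Improves generalization on the ProcGen benchmark through simple architectural changes and scaling. | reinforcement learning, deep reinforcement learning |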

🔬 Pillar 1: Robot Control (1 paper)

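| # | Title | One-line summary | Tags |
|---|---|---|---|
| 11 | SimBa: Simplicity Bias for Scaling Up Parameters in Deep Reinforcement Learning | SimBa: introduces a simplicity bias to scale up the parameter count of deep reinforcement learning models. | humanoid, reinforcement learning, deep reinforcement learning |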

🔬 Pillar 4: Generative Motion (1 paper)

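| # | Title | One-line summary | Tags |
|---|---|---|---|
| 12 | A Tidal Current Speed Forecasting Model based on Multi-Periodicity Learning | Proposes a tidal current speed forecasting model based on multi-periodicity learning, improving the stability of renewable energy grid integration. | penetration |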
