cs.LG(2024-10-13)
📊 12 papers in total | 🔗 1 with code
🎯 Interest Area Navigation
Pillar 9: Embodied Foundation Models (6 🔗1)
Pillar 2: RL & Architecture (4)
Pillar 1: Robot Control (1)
Pillar 4: Generative Motion (1)
🔬 Pillar 9: Embodied Foundation Models (6 papers)
| # | Title | One-line summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 1 | Simultaneous Computation and Memory Efficient Zeroth-Order Optimizer for Fine-Tuning Large Language Models | Proposes LeZO, a compute- and memory-efficient zeroth-order optimizer for fine-tuning large language models. | large language model | | |
| 2 | Targeted Vaccine: Safety Alignment for Large Language Models against Harmful Fine-Tuning via Layer-wise Perturbation | Proposes T-Vaccine, which uses layer-wise perturbation to safety-align large language models against harmful fine-tuning. | large language model | ✅ | |
| 3 | HARDMath: A Benchmark Dataset for Challenging Problems in Applied Mathematics | HARDMath: a benchmark dataset for evaluating large language models on challenging applied mathematics problems. | large language model, chain-of-thought | | |
| 4 | A Transformer Based Generative Chemical Language AI Model for Structural Elucidation of Organic Compounds | Proposes a Transformer-based generative chemical language AI model for structural elucidation of organic compounds. | large language model | | |
| 5 | ALLoRA: Adaptive Learning Rate Mitigates LoRA Fatal Flaws | Proposes ALLoRA to address LoRA's limitations in short training runs. | large language model | | |
| 6 | MoIN: Mixture of Introvert Experts to Upcycle an LLM | MoIN: a mixture of introvert experts for upcycling an existing large language model. | large language model | | |
🔬 Pillar 2: RL & Architecture (4 papers)
| # | Title | One-line summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 7 | Self-Data Distillation for Recovering Quality in Pruned Large Language Models | Proposes self-data distillation fine-tuning to recover the quality lost in pruned large language models. | distillation, large language model | | |
| 8 | Make the Pertinent Salient: Task-Relevant Reconstruction for Visual Control with Distractions | Proposes Segmentation Dreamer, which uses task-relevant reconstruction to improve the generalization of visual control in environments with distractions. | reinforcement learning, dreamer, representation learning | | |
| 9 | Meta-Reinforcement Learning with Universal Policy Adaptation: Provable Near-Optimality under All-task Optimum Comparator | Proposes a meta-reinforcement learning framework based on bilevel optimization that achieves universal policy adaptation with theoretical guarantees. | reinforcement learning | | |
| 10 | Improving Generalization on the ProcGen Benchmark with Simple Architectural Changes and Scale | Improves generalization on the ProcGen benchmark through simple architectural changes and scaling. | reinforcement learning, deep reinforcement learning | | |
🔬 Pillar 1: Robot Control (1 paper)
| # | Title | One-line summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 11 | SimBa: Simplicity Bias for Scaling Up Parameters in Deep Reinforcement Learning | SimBa: introduces a simplicity bias to scale up parameters in deep reinforcement learning. | humanoid, reinforcement learning, deep reinforcement learning | | |
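🔬 Pillar 4: Generative Motion (1 paper)
| # | Title | One-line summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 12 | A Tidal Current Speed Forecasting Model based on Multi-Periodicity Learning | Proposes a tidal current speed forecasting model based on multi-periodicity learning to improve the stability of renewable energy grid integration. | penetration | | |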