cs.LG (2024-10-06)
📊 11 papers in total | 🔗 1 with code
🎯 Interest Area Navigation
Pillar 9: Embodied Foundation Models (6)
Pillar 2: RL Algorithms & Architecture (4 🔗1)
Pillar 1: Robot Control (1)
🔬 Pillar 9: Embodied Foundation Models (6 papers)
| # | Title | One-line summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 1 | Large Language Model Inference Acceleration: A Comprehensive Hardware Perspective | Comprehensively analyzes how hardware platforms affect large language model inference acceleration, along with optimization methods | large language model, multimodal |  |  |
| 2 | Large Language Models for Knowledge-Free Network Management: Feasibility Study and Opportunities | Uses large language models to achieve knowledge-free network management | large language model, foundation model |  |  |
| 3 | EnsemW2S: Enhancing Weak-to-Strong Generalization with Large Language Model Ensembles | EnsemW2S: uses ensembles of large language models to improve weak-to-strong generalization | large language model |  |  |
| 4 | Hammer: Robust Function-Calling for On-Device Language Models via Function Masking | Hammer: achieves robust function calling for on-device language models via function masking | large language model, foundation model |  |  |
| 5 | On Evaluating LLMs' Capabilities as Functional Approximators: A Bayesian Perspective | Proposes a Bayesian evaluation framework that reveals the limitations and strengths of LLMs as function approximators | large language model |  |  |
| 6 | Continuous Approximations for Improving Quantization Aware Training of LLMs | Proposes continuous approximation methods to improve quantization-aware training of LLMs | large language model |  |  |
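🔬 Pillar 2: RL Algorithms & Architecture (4 papers)
| # | Title | One-line summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 7 | Latent Feature Mining for Predictive Model Enhancement with Large Language Models | Proposes the FLAME framework, which uses large language models to mine latent features and improve predictive models in settings with weakly correlated data | predictive model, large language model |  |  |
| 8 | Regressing the Relative Future: Efficient Policy Optimization for Multi-turn RLHF | Proposes REFUEL to address the policy optimization difficulties caused by covariate shift in multi-turn LLM dialogue | reinforcement learning, RLHF, DPO | ✅ |  |
| 9 | Robustness Reprogramming for Representation Learning | Proposes a robustness reprogramming method that improves a model's resistance to perturbations without modifying its parameters | representation learning |  |  |
| 10 | AdaMemento: Adaptive Memory-Assisted Policy Optimization for Reinforcement Learning | AdaMemento: adaptive memory-augmented policy optimization for sparse-reward reinforcement learning | reinforcement learning |  |  |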
🔬 Pillar 1: Robot Control (1 paper)
| # | Title | One-line summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 11 | Bisimulation metric for Model Predictive Control | Proposes an MPC method based on a bisimulation metric that improves the stability and robustness of model predictive control | MPC, model predictive control, reinforcement learning |  |  |