cs.LG (2024-09-03)
📊 10 papers in total | 🔗 1 with code
🎯 Interest Area Navigation
Pillar 2: RL & Architecture (5 🔗1)
Pillar 9: Embodied Foundation Models (4)
Pillar 1: Robot Control (1)
🔬 Pillar 2: RL & Architecture (5 papers)
| # | Title | One-sentence summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 1 | SmileyLlama: Modifying Large Language Models for Directed Chemical Space Exploration | SmileyLlama: directed chemical space exploration via instruction fine-tuning of an LLM | reinforcement learning, direct preference optimization, large language model | | |
| 2 | A Deep Reinforcement Learning Framework For Financial Portfolio Management | Proposes a deep reinforcement learning framework for financial portfolio management | reinforcement learning, deep reinforcement learning | | |
| 3 | Reinforcement Learning-enabled Satellite Constellation Reconfiguration and Retasking for Mission-Critical Applications | Proposes a reinforcement learning-based method for satellite constellation reconfiguration and retasking to handle satellite failures in mission-critical applications | reinforcement learning, PPO | | |
| 4 | Multi-Agent Reinforcement Learning for Joint Police Patrol and Dispatch | Proposes a multi-agent reinforcement learning method for joint patrol and dispatch to improve policing efficiency | reinforcement learning | | |
| 5 | Large-scale Urban Facility Location Selection with Knowledge-informed Reinforcement Learning | Proposes a knowledge-informed reinforcement learning method that efficiently solves large-scale urban facility location problems | reinforcement learning | ✅ | |
🔬 Pillar 9: Embodied Foundation Models (4 papers)
| # | Title | One-sentence summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 6 | TimeDiT: General-purpose Diffusion Transformers for Time Series Foundation Model | TimeDiT: a general-purpose diffusion Transformer for time series foundation models | large language model, foundation model | | |
| 7 | A Multimodal Object-level Contrast Learning Method for Cancer Survival Risk Prediction | Proposes a multimodal object-level contrastive learning method to improve the accuracy of cancer survival risk prediction | multimodal | | |
| 8 | Foundations of Large Language Model Compression -- Part 1: Weight Quantization | Proposes CVXQ, a convex optimization-based framework for LLM weight compression that enables flexible control of model size | large language model | | |
| 9 | RACONTEUR: A Knowledgeable, Insightful, and Portable LLM-Powered Shell Command Explainer | Raconteur: a knowledgeable and portable LLM-powered shell command explainer | large language model | | |
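🔬 Pillar 1: Robot Control (1 paper)
| # | Title | One-sentence summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 10 | Generative Principal Component Regression via Variational Inference | Proposes generative principal component regression (gPCR) based on variational inference to improve the selection of intervention targets in complex systems | manipulation, predictive model | | |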