cs.LG (2024-08-06)
📊 11 papers in total | 🔗 1 with code
🎯 Interest Area Navigation
Pillar 2: RL Algorithms & Architecture (6 🔗1)
Pillar 9: Embodied Foundation Models (3)
Pillar 8: Physics-based Animation (2)
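🔬 Pillar 2: RL Algorithms & Architecture (6 papers)
| # | Title | Key Point | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 1 | Can DPO Learn Diverse Human Values? A Theoretical Scaling Law | Proposes a theoretical generalization framework for DPO and analyzes the scaling law governing how LLMs learn diverse human values | preference learning, DPO, direct preference optimization | | |
| 2 | Research on Autonomous Driving Decision-making Strategies based Deep Reinforcement Learning | Proposes a decision-making strategy for autonomous driving based on deep reinforcement learning, improving adaptability to complex traffic scenarios | reinforcement learning, deep reinforcement learning, PPO | | |
| 3 | Highly Efficient Self-Adaptive Reward Shaping for Reinforcement Learning | Proposes a highly efficient self-adaptive reward shaping mechanism to address the sparse-reward problem in reinforcement learning | reinforcement learning, reward shaping | | |
| 4 | RELIEF: Reinforcement Learning Empowered Graph Feature Prompt Tuning | Proposes RELIEF, which uses reinforcement learning to optimize graph feature prompt tuning, improving the generalization and data efficiency of graph representation learning | reinforcement learning, representation learning | ✅ | |
| 5 | Spacecraft inertial parameters estimation using time series clustering and reinforcement learning | Proposes a spacecraft inertial parameter estimation method based on time series clustering and reinforcement learning | reinforcement learning | | |
| 6 | Prioritize Alignment in Dataset Distillation | Proposes PAD, which significantly improves dataset distillation performance by prioritizing information alignment | distillation | | |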
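🔬 Pillar 9: Embodied Foundation Models (3 papers)
| # | Title | Key Point | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 7 | LAMPO: Large Language Models as Preference Machines for Few-shot Ordinal Classification | LAMPO uses large language models as preference machines to tackle few-shot ordinal classification | large language model | | |
| 8 | Can LLMs Serve As Time Series Anomaly Detectors? | Explores the potential of LLMs as time series anomaly detectors, improving performance through prompt engineering and fine-tuning | large language model, chain-of-thought | | |
| 9 | LLM-Aided Compilation for Tensor Accelerators | Uses LLMs to assist compilation for tensor accelerators, improving hardware design flexibility and performance | large language model | | |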
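🔬 Pillar 8: Physics-based Animation (2 papers)
| # | Title | Key Point | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 10 | A Differential Smoothness-based Compact-Dynamic Graph Convolutional Network for Spatiotemporal Signal Recovery | Proposes the CDGCN model to address the failure of existing methods to effectively capture spatiotemporal correlations in spatiotemporal signal recovery | spatiotemporal | | |
| 11 | Data-Driven Stochastic Closure Modeling via Conditional Diffusion Model and Neural Operator | Proposes a data-driven stochastic closure modeling method based on a conditional diffusion model and a neural operator | spatiotemporal | | |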