cs.LG(2024-09-10)
📊 9 papers in total | 🔗 1 with code
🎯 Interest Area Navigation
Pillar 2: RL Algorithms and Architecture (4)
Pillar 9: Embodied Foundation Models (4 🔗1)
Pillar 8: Physics-based Animation (1)
🔬 Pillar 2: RL Algorithms and Architecture (4 papers)
| # | Title | Key Point | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
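| 1 | Length Desensitization in Direct Preference Optimization | Proposes LD-DPO, which addresses over-optimization toward text length in large language models during DPO training. | reinforcement learning RLHF DPO | ||
| 2 | Double Successive Over-Relaxation Q-Learning with an Extension to Deep Reinforcement Learning | Proposes double successive over-relaxation Q-learning, which speeds up convergence and reduces overestimation bias, and extends it to deep reinforcement learning. | reinforcement learning deep reinforcement learning | ||
| 3 | LAMP: Learnable Meta-Path Guided Adversarial Contrastive Learning for Heterogeneous Graphs | Proposes LAMP to address label dependence in contrastive learning on heterogeneous graphs. | contrastive learning | ||
| 4 | Geometric-Averaged Preference Optimization for Soft Preference Labels | Proposes geometric-averaged preference optimization, which uses soft preference labels to improve LLM alignment. | DPO direct preference optimization | ||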
🔬 Pillar 9: Embodied Foundation Models (4 papers)
| # | Title | Key Point | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
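| 5 | Beyond designer's knowledge: Generating materials design hypotheses via large language models | Uses large language models to generate materials design hypotheses that go beyond the limits of the designer's knowledge. | large language model | ||
| 6 | Scaling Law Hypothesis for Multimodal Model | Proposes a scaling law hypothesis for multimodal models to predict performance when training on cross-modal data. | multimodal | ||
| 7 | Ferret: Federated Full-Parameter Tuning at Scale for Large Language Models | Ferret: federated full-parameter tuning for LLMs at scale, balancing efficiency and accuracy. | large language model | ✅ | |
| 8 | STUN: Structured-Then-Unstructured Pruning for Scalable MoE Pruning | Proposes structured-then-unstructured pruning to improve scalability for MoE models. | large language model | ||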
🔬 Pillar 8: Physics-based Animation (1 paper)
| # | Title | Key Point | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 9 | Probabilistic Spatiotemporal Modeling of Day-Ahead Wind Power Generation with Input-Warped Gaussian Processes | Proposes a spatiotemporal model based on input-warped Gaussian processes for probabilistic day-ahead wind power forecasting. | spatiotemporal | ||