cs.LG (2024-09-10)

📊 9 papers in total | 🔗 1 with code

🎯 Navigation by interest area

Pillar 2: RL & Architecture (4) · Pillar 9: Embodied Foundation Models (4 🔗1) · Pillar 8: Physics-based Animation (1)

🔬 Pillar 2: RL & Architecture (4 papers)

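# | Title | Key point | Tags | 🔗
1 | Length Desensitization in Direct Preference Optimization | Proposes LD-DPO to address the over-optimization of text length by large language models during DPO training. | reinforcement learning, RLHF, DPO
2 | Double Successive Over-Relaxation Q-Learning with an Extension to Deep Reinforcement Learning | Proposes a double successive over-relaxation Q-learning algorithm that speeds up convergence and reduces overestimation bias, and extends it to deep reinforcement learning. | reinforcement learning, deep reinforcement learning
3 | LAMP: Learnable Meta-Path Guided Adversarial Contrastive Learning for Heterogeneous Graphs | Proposes LAMP to address label dependence in contrastive learning on heterogeneous graphs. | contrastive learning
4 | Geometric-Averaged Preference Optimization for Soft Preference Labels | Proposes a geometric-averaged preference optimization algorithm that uses soft preference labels to improve LLM alignment. | DPO, direct preference optimization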

🔬 Pillar 9: Embodied Foundation Models (4 papers)

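# | Title | Key point | Tags | 🔗
5 | Beyond designer's knowledge: Generating materials design hypotheses via large language models | Uses large language models to generate materials design hypotheses that go beyond the designer's own knowledge. | large language model
6 | Scaling Law Hypothesis for Multimodal Model | Proposes a scaling law hypothesis for multimodal models to predict performance when training on cross-modal data. | multimodal
7 | Ferret: Federated Full-Parameter Tuning at Scale for Large Language Models | Ferret: federated full-parameter tuning for LLMs at scale, balancing efficiency and accuracy. | large language model
8 | STUN: Structured-Then-Unstructured Pruning for Scalable MoE Pruning | Proposes structured-then-unstructured pruning to improve the scalability of MoE models. | large language model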

🔬 Pillar 8: Physics-based Animation (1 paper)

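# | Title | Key point | Tags | 🔗
9 | Probabilistic Spatiotemporal Modeling of Day-Ahead Wind Power Generation with Input-Warped Gaussian Processes | Proposes a spatiotemporal model based on input-warped Gaussian processes for probabilistic forecasting of day-ahead wind power generation. | spatiotemporal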
