cs.LG (2024-08-31)

📊 5 papers in total

🎯 Interest Area Navigation

Pillar 2: RL Algorithms & Architecture (4) · Pillar 9: Embodied Foundation Models (1)

🔬 Pillar 2: RL Algorithms & Architecture (4 papers)

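| # | Title | One-sentence summary | Tags |
|---|-------|----------------------|------|
| 1 | Sparse Mamba: Introducing Controllability, Observability, And Stability To Structural State Space Models | Sparse Mamba: improves structured state space models by introducing controllability, observability, and stability, with application to NLP. | Mamba, SSM, state space model |
| 2 | TSO: Self-Training with Scaled Preference Optimization | TSO: self-training with scaled preference optimization to improve the alignment of LLMs with human preferences. | preference learning, DPO, direct preference optimization |
| 3 | Foundations of Multivariate Distributional Reinforcement Learning | Proposes oracle-free multivariate distributional reinforcement learning algorithms for problems such as multi-objective decision-making. | reinforcement learning, representation learning |
| 4 | Robust off-policy Reinforcement Learning via Soft Constrained Adversary | Proposes a robust off-policy reinforcement learning method based on an f-divergence-constrained adversary. | reinforcement learning |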

🔬 Pillar 9: Embodied Foundation Models (1 paper)

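| # | Title | One-sentence summary | Tags |
|---|-------|----------------------|------|
| 5 | CoRA: Optimizing Low-Rank Adaptation with Common Subspace of Large Language Models | CoRA: optimizes low-rank adaptation using the common subspace of large language models, improving fine-tuning efficiency. | large language model |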
