cs.LG (2024-07-13)
📊 7 papers in total | 🔗 1 with code
🎯 Interest Area Navigation
Pillar 2: RL & Architecture (3)
Pillar 9: Embodied Foundation Models (2)
Pillar 5: Interaction & Reaction (2 🔗1)
🔬 Pillar 2: RL & Architecture (3 papers)
| # | Title | One-sentence summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
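| 1 | Hydra: Bidirectional State Space Models Through Generalized Matrix Mixers | Hydra: builds bidirectional state space models through generalized matrix mixers, significantly improving performance on non-causal tasks. | Mamba SSM state space model | | |
| 2 | Global Reinforcement Learning: Beyond Linear and Convex Rewards via Submodular Semi-gradient Methods | Proposes global reinforcement learning based on submodular semi-gradient methods, addressing the limitations of conventional RL in modeling complex rewards. | reinforcement learning imitation learning | | |
| 3 | Model-free Distortion Canceling and Control of Quantum Devices | Proposes a model-free method, based on deep reinforcement learning, for distortion canceling and control of quantum devices. | reinforcement learning deep reinforcement learning DRL | | |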
🔬 Pillar 9: Embodied Foundation Models (2 papers)
| # | Title | One-sentence summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 4 | MedLeak: Multimodal Medical Data Leakage in Secure Federated Learning with Crafted Models | MedLeak: leaks multimodal medical data in federated learning through maliciously crafted models. | multimodal | | |
| 5 | OptiBench Meets ReSocratic: Measure and Improve LLMs for Optimization Modeling | Proposes the OptiBench benchmark and the ReSocratic data synthesis method to improve LLM problem-solving in optimization modeling. | large language model | | |
🔬 Pillar 5: Interaction & Reaction (2 papers)
| # | Title | One-sentence summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
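| 6 | LFFR: Logistic Function For (single-output) Regression | Proposes the LFFR algorithm for privacy-preserving regression analysis. | OMOMO | | |
| 7 | Learning a Mini-batch Graph Transformer via Two-stage Interaction Augmentation | Proposes LGMformer, a mini-batch graph Transformer with two-stage interaction augmentation, improving semi-supervised node prediction performance. | interaction transformer | ✅ | |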