| # | Title | Summary | Keywords | ✅ |
|---|---|---|---|---|
| 1 | HyMaTE: A Hybrid Mamba and Transformer Model for EHR Representation Learning | Proposes HyMaTE, which combines Mamba and Transformer to improve EHR representation learning. | Mamba, SSM, state space model | ✅ |
| 2 | Dynamic Policy Induction for Adaptive Prompt Optimization: Bridging the Efficiency-Accuracy Gap via Lightweight Reinforcement Learning | Proposes a Prompt Policy Network that adaptively optimizes LLM prompting strategies via lightweight reinforcement learning, improving efficiency while preserving accuracy. | reinforcement learning, PPO, large language model | |
| 3 | InfMasking: Unleashing Synergistic Information by Contrastive Multimodal Interactions | Proposes InfMasking, which strengthens synergistic information through contrastive multimodal interactions to improve multimodal representation learning. | representation learning, multimodal | ✅ |
| 4 | In-Context Compositional Q-Learning for Offline Reinforcement Learning | Proposes ICQL, which uses in-context learning for compositional Q-function estimation in offline reinforcement learning. | reinforcement learning, offline RL, offline reinforcement learning | |
| 5 | A Weather Foundation Model for the Power Grid | A weather forecasting foundation model tailored to the power grid, improving early warning of extreme weather events. | MAE, foundation model | |
| 6 | MemMamba: Rethinking Memory Patterns in State Space Model | MemMamba: improves the long-sequence memory of state space models via state summarization and cross-layer attention. | Mamba, state space model | |
| 7 | Trained Mamba Emulates Online Gradient Descent in In-Context Linear Regression | Reveals the mechanism by which a trained Mamba emulates online gradient descent in in-context linear regression. | Mamba, SSM, foundation model | |
| 8 | Explore-Execute Chain: Towards an Efficient Structured Reasoning Paradigm | Proposes the Explore-Execute Chain framework, which decouples planning from execution to improve LLM reasoning efficiency and interpretability. | reinforcement learning, large language model, chain-of-thought | ✅ |
| 9 | DRIK: Distribution-Robust Inductive Kriging without Information Leakage | DRIK: a distribution-robust inductive kriging method that avoids information leakage and improves generalization on spatiotemporal data. | MAE, sparse sensors, spatial relationship | |
| 10 | GPS-MTM: Capturing Pattern of Normalcy in GPS-Trajectories with self-supervised learning | GPS-MTM: captures patterns of normalcy in GPS trajectories via self-supervised learning. | trajectory transformer, representation learning, foundation model | |
| 11 | Curriculum-Guided Reinforcement Learning for Synthesizing Gas-Efficient Financial Derivatives Contracts | Proposes a curriculum-guided reinforcement learning framework for synthesizing gas-efficient financial-derivatives smart contracts. | reinforcement learning, PPO | |
| 12 | Adversarial Diffusion for Robust Reinforcement Learning | Proposes AD-RRL, which uses adversarial diffusion models to improve the robustness of reinforcement learning in uncertain environments. | reinforcement learning, model-based RL | |
| 13 | SLA: Beyond Sparsity in Diffusion Transformers via Fine-Tunable Sparse-Linear Attention | Proposes SLA, a fine-tunable sparse-linear attention mechanism that accelerates Diffusion Transformer models. | linear attention | ✅ |
| 14 | GeoFunFlow: Geometric Function Flow Matching for Inverse Operator Learning over Complex Geometries | Proposes GeoFunFlow to solve inverse problems over complex geometries. | flow matching | |
| 15 | Bridging On-Device and Cloud LLMs for Collaborative Reasoning: A Unified Methodology for Local Routing and Post-Training | Proposes a device-cloud collaborative reasoning method that uses reinforcement learning to improve routing and reasoning in on-device LLMs. | reinforcement learning, large language model | |
| 16 | Optimism as Risk-Seeking in Multi-Agent Reinforcement Learning | Proposes a multi-agent reinforcement learning method based on risk-seeking optimism, improving performance in cooperative games. | reinforcement learning | |
| 17 | Guide: Generalized-Prior and Data Encoders for DAG Estimation | GUIDE: a DAG-estimation framework that fuses LLM priors with data encoders. | reinforcement learning, large language model | |
| 18 | Space Group Conditional Flow Matching | Proposes a space-group conditional flow matching model for generating stable crystal structures with high symmetry. | flow matching | |
| 19 | An Investigation of Batch Normalization in Off-Policy Actor-Critic Algorithms | Proposes Mode-Aware Batch Normalization (MA-BN), improving the stability and performance of off-policy actor-critic algorithms. | reinforcement learning, deep reinforcement learning, DRL | ✅ |
| 20 | Why Alignment Must Precede Distillation: A Minimal Working Explanation | Proposes aligning before distilling, addressing the poor alignment of models after knowledge distillation. | distillation | |
| 21 | Efficient Multi-turn RL for GUI Agents via Decoupled Training and Adaptive Data Curation | DART: decoupled training and adaptive data curation to improve the efficiency of multi-turn reinforcement learning for GUI agents. | reinforcement learning, policy learning | |