| # | Title | Summary | Keywords |
| --- | --- | --- | --- |
| 1 | Assessing the Potential of Masked Autoencoder Foundation Models in Predicting Downhole Metrics from Surface Drilling Data | Assesses the potential of masked autoencoder foundation models for predicting downhole metrics from surface drilling data. | masked autoencoder; foundation model |
| 2 | Learning Ad Hoc Network Dynamics via Graph-Structured World Models | Proposes G-RSSM, which learns ad hoc network dynamics via a graph-structured world model for size-agnostic node decision-making. | reinforcement learning; deep reinforcement learning; world model |
| 3 | DLink: Distilling Layer-wise and Dominant Knowledge from EEG Foundation Models | DLink distills layer-wise and dominant knowledge from EEG foundation models for lightweight deployment. | teacher-student distillation; foundation model |
| 4 | MambaSL: Exploring Single-Layer Mamba for Time Series Classification | MambaSL explores a single-layer Mamba model for time series classification. | Mamba; SSM; state space model |
| 5 | LongAct: Harnessing Intrinsic Activation Patterns for Long-Context Reinforcement Learning | LongAct harnesses intrinsic activation patterns to improve long-context reinforcement learning performance. | reinforcement learning; large language model |
| 6 | On the Expressive Power and Limitations of Multi-Layer SSMs | Reveals the limitations of multi-layer SSMs on compositional tasks and explores how online chain-of-thought improves their expressive power. | SSM; chain-of-thought |
| 7 | RL-STPA: Adapting System-Theoretic Hazard Analysis for Safety-Critical Reinforcement Learning | Proposes the RL-STPA framework for systematic hazard analysis in safety-critical reinforcement learning. | reinforcement learning; reward shaping |
| 8 | Wasserstein Formulation of Reinforcement Learning: An Optimal Transport Perspective on Policy Optimization | Proposes a reinforcement learning framework formulated in Wasserstein space for policy optimization. | reinforcement learning |
| 9 | Beyond Importance Sampling: Rejection-Gated Policy Optimization | Proposes RGPO, which optimizes the policy via a learnable acceptance gate, improving the stability and performance of reinforcement learning. | PPO; RLHF |