| 12 |
On the Value of Cross-Modal Misalignment in Multimodal Representation Learning |
Models cross-modal misalignment to improve the performance and interpretability of multimodal representation learning |
representation learning contrastive learning multimodal |
|
|
| 13 |
M1: Towards Scalable Test-Time Compute with Mamba Reasoning Models |
Proposes M1, a Mamba-based hybrid linear RNN reasoning model that improves test-time compute efficiency. |
Mamba distillation large language model |
|
|
| 14 |
Achieving Optimal Tissue Repair Through MARL with Reward Shaping and Curriculum Learning |
Proposes a MARL-based tissue repair framework that optimizes the repair process through reward shaping and curriculum learning |
reinforcement learning curriculum learning reward shaping |
|
|
| 15 |
Adaptive Sensor Steering Strategy Using Deep Reinforcement Learning for Dynamic Data Acquisition in Digital Twins |
Proposes an adaptive sensor control strategy based on deep reinforcement learning for dynamic data acquisition in digital twins. |
reinforcement learning deep reinforcement learning |
|
|
| 16 |
STaRFormer: Semi-Supervised Task-Informed Representation Learning via Dynamic Attention-Based Regional Masking for Sequential Data |
STaRFormer: semi-supervised, task-informed representation learning for sequential data via dynamic attention-based regional masking |
representation learning contrastive learning spatiotemporal |
|
|
| 17 |
Reasoning without Regret |
Proposes the BARS framework to address the problem of making sparse reward signals effective |
reward shaping large language model chain-of-thought |
|
|
| 18 |
Using Reinforcement Learning to Integrate Subjective Wellbeing into Climate Adaptation Decision Making |
Proposes a reinforcement learning framework for integrating subjective wellbeing into climate adaptation decision making |
reinforcement learning |
|
|
| 19 |
AimTS: Augmented Series and Image Contrastive Learning for Time Series Classification |
AimTS: improves time series classification performance through augmented series and image contrastive learning |
contrastive learning |
|
|
| 20 |
Improving Controller Generalization with Dimensionless Markov Decision Processes |
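Proposes a reinforcement learning method based on dimensionless MDPs that improves controller generalization across different environments |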
reinforcement learning world model |
|
|
| 21 |
Moderate Actor-Critic Methods: Controlling Overestimation Bias via Expectile Loss |
Proposes moderate actor-critic methods based on an expectile loss to suppress Q-function overestimation bias |
reinforcement learning SAC |
|
|