| 10 |
Bridging Modalities via Progressive Re-alignment for Multimodal Test-Time Adaptation |
Proposes the BriMPR framework, which addresses multimodal test-time adaptation through progressive re-alignment |
contrastive learning multimodal |
✅ |
|
| 11 |
PerfMamba: Performance Analysis and Pruning of Selective State Space Models |
PerfMamba: optimizes selective state space models through performance analysis and pruning |
Mamba SSM state space model |
|
|
| 12 |
A Hierarchical Hybrid AI Approach: Integrating Deep Reinforcement Learning and Scripted Agents in Combat Simulations |
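Proposes a hierarchical hybrid AI approach that integrates deep reinforcement learning with scripted agents to improve combat simulation performance. |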
reinforcement learning deep reinforcement learning |
|
|
| 13 |
LFM2 Technical Report |
LFM2: Liquid Foundation Models for efficient deployment on edge devices, balancing speed and performance. |
curriculum learning distillation foundation model |
|
|
| 14 |
SmallWorlds: Assessing Dynamics Understanding of World Models in Isolated Environments |
SmallWorlds: evaluates how well world models understand dynamics in isolated environments |
world model state space model representation learning |
|
|
| 15 |
OBLR-PO: A Theoretical Framework for Stable Reinforcement Learning |
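Proposes the OBLR-PO algorithm, which improves the stability of RL post-training for LLMs through theoretically guided adaptive learning rates and baseline optimization. |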
reinforcement learning large language model |
|
|
| 16 |
ThetaEvolve: Test-time Learning on Open Problems |
ThetaEvolve: a test-time learning framework for open problems that enables continual evolution. |
reinforcement learning reward shaping large language model |
✅ |
|
| 17 |
ASTRO: Adaptive Stitching via Dynamics-Guided Trajectory Rollouts |
ASTRO: adaptive stitching via dynamics-guided trajectory rollouts, improving offline reinforcement learning performance |
reinforcement learning policy learning offline RL |
|
|
| 18 |
Emergent Coordination and Phase Structure in Independent Multi-Agent Reinforcement Learning |
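Reveals emergent coordination and phase structure in independent multi-agent reinforcement learning, focusing on the interplay of scale, density, and kernel drift. |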
reinforcement learning IQL |
|
|