| 18 |
Objects matter: object-centric world models improve reinforcement learning in visually complex environments |
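Proposes OC-STORM, which uses an object-centric world model to improve the sample efficiency of reinforcement learning in visually complex environments. |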
reinforcement learning deep reinforcement learning world model |
|
|
| 19 |
Inverse Reinforcement Learning via Convex Optimization |
Proposes a convex-optimization-based inverse reinforcement learning method that improves robustness and reproducibility. |
reinforcement learning inverse reinforcement learning |
|
|
| 20 |
Mixture-of-Mamba: Enhancing Multi-Modal State-Space Models with Modality-Aware Sparsity |
Proposes Mixture-of-Mamba, which introduces modality-aware sparsity into multi-modal state-space models. |
Mamba SSM state space model |
✅ |
|
| 21 |
Upside Down Reinforcement Learning with Policy Generators |
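Proposes Upside Down Reinforcement Learning with Policy Generators (UDRLPG), a framework that improves the sample efficiency of reinforcement learning. |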
reinforcement learning multimodal |
✅ |
|
| 22 |
Application of Structured State Space Models to High energy physics with locality-sensitive hashing |
Proposes a structured state space model with locality-sensitive hashing to handle long-sequence processing in high energy physics. |
Mamba SSM state space model |
|
|
| 23 |
sDREAMER: Self-distilled Mixture-of-Modality-Experts Transformer for Automatic Sleep Staging |
Proposes sDREAMER, which uses a self-distilled mixture-of-modality-experts Transformer for automatic sleep staging. |
dreamer distillation |
|
|
| 24 |
Towards General-Purpose Model-Free Reinforcement Learning |
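Proposes MR.Q, an algorithm that uses model-based representations to linearize the value function, enabling general-purpose model-free reinforcement learning. |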
reinforcement learning model-based RL |
|
|
| 25 |
Training Dynamics of In-Context Learning in Linear Attention |
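Studies the training dynamics of in-context learning in linear attention, revealing how the choice of parameterization affects the learning process. |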
linear attention |
|
|
| 26 |
The Effect of Optimal Self-Distillation in Noisy Gaussian Mixture Model |
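Studies the effectiveness of self-distillation in a noisy Gaussian mixture model, reveals its denoising mechanism, and proposes optimization strategies. |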
distillation |
|
|
| 27 |
ReFill: Reinforcement Learning for Fill-In Minimization |
ReFill: proposes a reinforcement-learning method for fill-in minimization, improving the efficiency of solving sparse linear systems. |
reinforcement learning |
|
|
| 28 |
Multi-Objective Reinforcement Learning for Power Grid Topology Control |
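Proposes a multi-objective reinforcement learning method for power grid topology control that optimizes line loading and topology structure. |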
reinforcement learning |
|
|
| 29 |
Efficient Logit-based Knowledge Distillation of Deep Spiking Neural Networks for Full-Range Timestep Deployment |
Proposes an efficient logit-based knowledge distillation method to address timestep deployment of deep spiking neural networks. |
distillation |
✅ |
|
| 30 |
The Sample Complexity of Online Reinforcement Learning: A Multi-model Perspective |
Proposes a sample-complexity analysis of online reinforcement learning for nonlinear dynamical systems. |
reinforcement learning |
|
|
| 31 |
Benchmarking Quantum Reinforcement Learning |
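Proposes a benchmarking methodology for quantum reinforcement learning to evaluate and validate the performance of quantum algorithms. |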
reinforcement learning |
|
|
| 32 |
Foundation for unbiased cross-validation of spatio-temporal models for species distribution modeling |
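Proposes a cross-validation method based on spatial autocorrelation to improve the spatio-temporal generalization of species distribution models. |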
SAC MAE |
|
|
| 33 |
Challenging Assumptions in Learning Generic Text Style Embeddings |
Proposes a contrastive-learning approach to generic text style embeddings and re-examines existing assumptions. |
representation learning contrastive learning |
|
|