| # | Title | Summary | Keywords | |
|---|---|---|---|---|
| 1 | Max-Entropy Reinforcement Learning with Flow Matching and A Case Study on LQR | Proposes a flow-matching-based maximum-entropy RL algorithm to improve policy expressiveness and robustness. | reinforcement learning; SAC; flow matching | |
| 2 | Splitwise: Collaborative Edge-Cloud Inference for LLMs via Lyapunov-Assisted DRL | Splitwise: Lyapunov-assisted DRL for adaptive partitioning of LLM inference across edge and cloud. | reinforcement learning; deep reinforcement learning; DRL | |
| 3 | Stochastic Siamese MAE Pretraining for Longitudinal Medical Images | Proposes STAMP, a stochastic Siamese MAE pretraining framework for longitudinal medical images. | representation learning; MAE; foundation model | |
| 4 | MS-SSM: A Multi-Scale State Space Model for Efficient Sequence Modeling | Proposes MS-SSM, a multi-scale state space model for efficient sequence modeling. | SSM; state space model | |
| 5 | Bellman Calibration for V-Learning in Offline Reinforcement Learning | Proposes an iterative Bellman calibration method for calibrating V-function predictions in offline RL. | reinforcement learning; offline reinforcement learning | |
| 6 | Joint Link Adaptation and Device Scheduling Approach for URLLC Industrial IoT Network: A DRL-based Method with Bayesian Optimization | Proposes a Bayesian-optimization-assisted DRL method for joint link adaptation and device scheduling in URLLC industrial IoT networks. | DRL; TD3 | |
| 7 | Eliminating Inductive Bias in Reward Models with Information-Theoretic Guidance | Proposes DIR, which removes inductive bias in reward models via information-theoretic optimization, improving RLHF performance. | reinforcement learning; RLHF; large language model | ✅ |
| 8 | On the Inverse Flow Matching Problem in the One-Dimensional and Gaussian Cases | Studies the inverse flow matching problem in the one-dimensional and Gaussian cases, providing a theoretical foundation for distilling flow matching models. | flow matching; distillation | |
| 9 | Diffusion-based Decentralized Federated Multi-Task Representation Learning | Proposes a diffusion-based decentralized federated multi-task representation learning algorithm for feature extraction under data scarcity. | representation learning | |
| 10 | Efficient Deep Learning for Short-Term Solar Irradiance Time Series Forecasting: A Benchmark Study in Ho Chi Minh City | Proposes a Transformer model combined with knowledge distillation for efficient deployment in short-term solar irradiance forecasting. | Mamba; MAE; distillation | |
| 11 | Flow Matching Neural Processes | Proposes a flow-matching-based neural process model, improving the efficiency and accuracy of conditional-distribution sampling. | flow matching | |
| 12 | SB-TRPO: Towards Safe Reinforcement Learning with Hard Constraints | SB-TRPO: safe reinforcement learning with hard constraints that dynamically balances cost reduction and reward improvement. | reinforcement learning | |