| 1 |
LRT-Diffusion: Calibrated Risk-Aware Guidance for Diffusion Policies |
LRT-Diffusion: a calibrated, risk-aware guidance method for diffusion policies in offline reinforcement learning |
reinforcement learning offline RL offline reinforcement learning |
|
|
| 2 |
Greedy Sampling Is Provably Efficient for RLHF |
Proposes a greedy sampling algorithm for RLHF under general preference models and proves its efficiency |
reinforcement learning RLHF large language model |
|
|
| 3 |
HiMAE: Hierarchical Masked Autoencoders Discover Resolution-Specific Structure in Wearable Time Series |
HiMAE: hierarchical masked autoencoders discover resolution-specific structure in wearable time series |
representation learning masked autoencoder foundation model |
|
|
| 4 |
SpatialTraceGen: High-Fidelity Traces for Efficient VLM Spatial Reasoning Distillation |
SpatialTraceGen: high-fidelity trace generation for efficient distillation of VLM spatial reasoning |
reinforcement learning offline reinforcement learning distillation |
|
|
| 5 |
Dual-Mind World Models: A General Framework for Learning in Dynamic Wireless Networks |
Proposes Dual-Mind world models to address data inefficiency and poor generalization in dynamic wireless networks. |
reinforcement learning world model model-based RL |
|
|
| 6 |
Non-myopic Matching and Rebalancing in Large-Scale On-Demand Ride-Pooling Systems Using Simulation-Informed Reinforcement Learning |
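Proposes a non-myopic matching and rebalancing algorithm based on simulation-informed reinforcement learning to improve the efficiency of large-scale on-demand ride-pooling systems. |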
reinforcement learning spatiotemporal |
|
|
| 7 |
PaTaRM: Bridging Pairwise and Pointwise Signals via Preference-Aware Task-Adaptive Reward Modeling |
PaTaRM: bridges pairwise and pointwise signals through preference-aware task-adaptive reward modeling, improving RLHF performance |
reinforcement learning RLHF large language model |
✅ |
|
| 8 |
Enhancing Hierarchical Reinforcement Learning through Change Point Detection in Time Series |
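Proposes a Transformer-based change point detection module to improve the scalability of hierarchical reinforcement learning on long-horizon tasks. |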
reinforcement learning |
|
|
| 9 |
Eigenfunction Extraction for Ordered Representation Learning |
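Proposes an eigenfunction extraction framework for ordered representation learning, improving the efficiency and accuracy of feature selection. |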
representation learning |
|
|
| 10 |
Perception Learning: A Formal Separation of Sensory Representation Learning from Decision Learning |
Proposes perception learning to address the separation of sensory representation learning from decision learning |
representation learning |
|
|
| 11 |
Causal-Aware Generative Adversarial Networks with Reinforcement Learning |
Proposes CA-GAN, which uses causal graphs and reinforcement learning to generate high-quality, privacy-preserving tabular data. |
reinforcement learning |
|
|
| 12 |
Predicting Barge Tow Size on Inland Waterways Using Vessel Trajectory Derived Features: Proof of Concept |
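Proposes a machine learning method based on AIS data to predict the number of barges in tows on inland waterways, improving waterway awareness. |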
MAE spatiotemporal |
|
|
| 13 |
Sample-efficient and Scalable Exploration in Continuous-Time RL |
Proposes the COMBRL algorithm to address sample efficiency and scalability in continuous-time reinforcement learning. |
reinforcement learning model-based RL |
|
|
| 14 |
A Novel XAI-Enhanced Quantum Adversarial Networks for Velocity Dispersion Modeling in MaNGA Galaxies |
Proposes an XAI-enhanced quantum adversarial network for modeling galaxy velocity dispersion |
predictive model MAE |
|
|