| 1 |
A Deep Reinforcement Learning Approach to Automated Stock Trading, using xLSTM Networks |
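Proposes a deep reinforcement learning method for stock trading based on xLSTM networks, improving the modeling of long-term dependencies. |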
reinforcement learning deep reinforcement learning DRL |
|
|
| 2 |
A Survey of Direct Preference Optimization |
A survey of DPO: Direct Preference Optimization methods that improve the efficiency and stability of LLM alignment |
reinforcement learning RLHF DPO |
✅ |
|
| 3 |
Towards Robust Multimodal Representation: A Unified Approach with Adaptive Experts and Alignment |
Proposes the MoSARe model to address robust representation learning when medical multimodal data is missing |
contrastive learning multimodal |
✅ |
|
| 4 |
Language-Enhanced Representation Learning for Single-Cell Transcriptomics |
Proposes scMMGPT for language-enhanced representation learning in single-cell transcriptomics. |
representation learning large language model multimodal |
|
|
| 5 |
The Pitfalls of Imitation Learning when Actions are Continuous |
Reveals the limitations of imitation learning in continuous action spaces and explores strategies for improvement |
offline RL imitation learning behavior cloning |
|
|
| 6 |
Temporal Difference Flows |
Proposes TD-Flow, which learns accurate long-horizon geometric horizon models through a Bellman equation on probability paths and flow matching. |
flow matching world model predictive model |
|
|
| 7 |
Strategyproof Reinforcement Learning from Human Feedback |
Proposes the Pessimistic Median of MLEs algorithm to address policy bias caused by strategic feedback in RLHF |
reinforcement learning RLHF |
|
|
| 8 |
Rule-Guided Reinforcement Learning Policy Evaluation and Improvement |
LEGIBLE: a rule-guided method for reinforcement learning policy evaluation and improvement |
reinforcement learning deep reinforcement learning |
|
|
| 9 |
Implicit Contrastive Representation Learning with Guided Stop-gradient |
Proposes a guided stop-gradient method that improves the stability and performance of self-supervised contrastive learning |
representation learning contrastive learning |
✅ |
|
| 10 |
Distributionally Robust Multi-Agent Reinforcement Learning for Dynamic Chute Mapping |
Proposes the DRMARL framework for chute mapping that is robust to uncertain induction rates in dynamic sortation at Amazon warehouses |
reinforcement learning |
|
|
| 11 |
Privacy-Preserved Automated Scoring using Federated Learning for Educational Research |
Proposes a privacy-preserving automated scoring framework based on federated learning for educational assessment research. |
MAE large language model |
|
|
| 12 |
ConjointNet: Enhancing Conjoint Analysis for Preference Prediction with Representation Learning |
ConjointNet: uses representation learning to enhance conjoint analysis and improve preference prediction accuracy |
representation learning |
|
|
| 13 |
Towards Causal Model-Based Policy Optimization |
Proposes C-MBPO, which uses causal models to improve the generalization and robustness of model-based reinforcement learning |
reinforcement learning policy learning predictive model |
|
|
| 14 |
Optimisation of the Accelerator Control by Reinforcement Learning: A Simulation-Based Approach |
Proposes a reinforcement-learning-based framework for optimizing accelerator control, improving beamline performance |
reinforcement learning |
|
|
| 15 |
Reinforcement Learning is all You Need |
Trains a 3B language model with pure reinforcement learning to improve its reasoning ability. |
reinforcement learning |
|
|
| 16 |
Evaluating Reinforcement Learning Safety and Trustworthiness in Cyber-Physical Systems |
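Proposes the SAFE-RL framework for evaluating and improving the safety and trustworthiness of reinforcement learning in cyber-physical systems |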
reinforcement learning |
|
|
| 17 |
Representation Retrieval Learning for Heterogeneous Data Integration |
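Proposes R2, a representation retrieval learning framework, to address degraded predictive performance in heterogeneous data integration. |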
predictive model representation learning |
|
|