| 16 |
LearnAlign: Reasoning Data Selection for Reinforcement Learning in Large Language Models Based on Improved Gradient Alignment |
Proposes LearnAlign to address data selection for reinforcement learning in large language models |
reinforcement learning large language model |
|
|
| 17 |
Visual Pre-Training on Unlabeled Images using Reinforcement Learning |
Proposes a reinforcement-learning-based pre-training method on unlabeled images to improve feature learning |
reinforcement learning visual pre-training |
|
|
| 18 |
Automated Treatment Planning for Interstitial HDR Brachytherapy for Locally Advanced Cervical Cancer using Deep Reinforcement Learning |
Proposes a deep-reinforcement-learning-based framework for automated HDR brachytherapy treatment planning for cervical cancer |
reinforcement learning deep reinforcement learning |
|
|
| 19 |
Growing with Experience: Growing Neural Networks in Deep Reinforcement Learning |
Proposes GrowNN to address the difficulty of training networks in deep reinforcement learning |
reinforcement learning deep reinforcement learning |
|
|
| 20 |
Task-Driven Discrete Representation Learning |
Proposes a task-driven discrete representation learning framework to improve downstream task performance |
DRL representation learning VQ-VAE |
|
|
| 21 |
Brewing Knowledge in Context: Distillation Perspectives on In-Context Learning |
Proposes a knowledge distillation perspective for understanding the mechanism of in-context learning |
distillation large language model |
|
|
| 22 |
Understanding Input Selectivity in Mamba: Impact on Approximation Power, Memorization, and Associative Recall Capacity |
Reveals how input selectivity in Mamba affects approximation power and memorization |
Mamba SSM |
|
|
| 23 |
From Emergence to Control: Probing and Modulating Self-Reflection in Language Models |
Proposes a reflection-inducing probing method to strengthen self-reflection in language models |
reinforcement learning large language model |
|
|
| 24 |
Interpretable representation learning of quantum data enabled by probabilistic variational autoencoders |
Proposes a variational-autoencoder-based method for interpretable representation learning of quantum data |
representation learning |
|
|
| 25 |
TreeRL: LLM Reinforcement Learning with On-Policy Tree Search |
Proposes the TreeRL framework to address insufficient exploration in conventional RL methods |
reinforcement learning |
✅ |
|
| 26 |
Attention-based Adversarial Robust Distillation in Radio Signal Classifications for Low-Power IoT Devices |
Proposes an attention-based adversarially robust distillation method for signal classification on low-power IoT devices |
distillation |
|
|
| 27 |
ReVeal: Self-Evolving Code Agents via Reliable Self-Verification |
Proposes ReVeal to address unreliable self-verification |
reinforcement learning large language model |
|
|
| 28 |
An Explainable AI Framework for Dynamic Resource Management in Vehicular Network Slicing |
Proposes an explainable deep reinforcement learning framework for dynamic resource management in vehicular network slicing |
reinforcement learning deep reinforcement learning |
|
|