| 1 |
Greener Deep Reinforcement Learning: Analysis of Energy and Carbon Efficiency Across Atari Benchmarks |
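Evaluates the energy consumption and carbon-emission efficiency of deep reinforcement learning algorithms on Atari games, providing a benchmark for green AI. |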
reinforcement learning, deep reinforcement learning, DRL
|
|
| 2 |
Deep Reinforcement Learning for Ranking Utility Tuning in the Ad Recommender System at Pinterest |
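Proposes the DRL-PUT framework, which uses deep reinforcement learning to optimize the ranking utility function in Pinterest's ad recommender system. |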
reinforcement learning, deep reinforcement learning, DRL
|
|
| 3 |
FinXplore: An Adaptive Deep Reinforcement Learning Framework for Balancing and Discovering Investment Opportunities |
FinXplore: an adaptive deep reinforcement learning framework for balancing and discovering investment opportunities. |
reinforcement learning, deep reinforcement learning, DRL
|
|
| 4 |
Beyond I-Con: Exploring New Dimension of Distance Measures in Representation Learning |
Beyond I-Con: explores new dimensions of distance measures in representation learning to improve clustering and dimensionality reduction. |
representation learning, contrastive learning
|
|
| 5 |
Self-Aligned Reward: Towards Effective and Efficient Reasoners |
Proposes Self-Aligned Reward (SAR), which improves LLM reasoning accuracy and efficiency while reducing computational cost. |
reinforcement learning, PPO, large language model
|
|
| 6 |
An Arbitration Control for an Ensemble of Diversified DQN variants in Continual Reinforcement Learning |
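Proposes ACED-DQN, which applies arbitration control to an ensemble of diversified DQN variants to address catastrophic forgetting in continual reinforcement learning. |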
reinforcement learning, deep reinforcement learning
|
|
| 7 |
MambaLite-Micro: Memory-Optimized Mamba Inference on MCUs |
MambaLite-Micro: a memory-optimized Mamba inference engine for MCUs. |
Mamba |
|
|
| 8 |
PLanTS: Periodicity-aware Latent-state Representation Learning for Multivariate Time Series |
PLanTS: proposes a periodicity-aware latent-state representation learning framework for multivariate time series analysis. |
representation learning |
|
|
| 9 |
SpikingBrain: Spiking Brain-inspired Large Models |
SpikingBrain: brain-inspired large models with linear attention that improve long-text processing efficiency. |
linear attention, large language model
|
|
| 10 |
Shift Before You Learn: Enabling Low-Rank Representations in Reinforcement Learning |
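Proposes a low-rank reinforcement learning method based on the shifted successor measure, improving goal-conditioned RL performance. |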
reinforcement learning |
|
|
| 11 |
Pre-Forgettable Models: Prompt Learning as a Native Mechanism for Unlearning |
Proposes Pre-Forgettable Models, which use prompt learning to make forgetting a native model capability, addressing data-privacy compliance. |
distillation, foundation model
|
|
| 12 |
Topology-Aware Graph Reinforcement Learning for Dynamic Routing in Cloud Networks |
Proposes topology-aware graph reinforcement learning to solve dynamic routing optimization in cloud networks. |
reinforcement learning |
|
|