| # | Title | Summary | Keywords |  |
| --- | --- | --- | --- | --- |
| 1 | Optimizing Novelty of Top-k Recommendations using Large Language Models and Reinforcement Learning | Optimizes the novelty of top-k recommendations using large language models and reinforcement learning. | reinforcement learning, large language model |  |
| 2 | PostMark: A Robust Blackbox Watermark for Large Language Models | Proposes PostMark, a robust black-box watermarking scheme for large language models that requires no access to model logits. | distillation, large language model | ✅ |
| 3 | Revealing Vision-Language Integration in the Brain with Multimodal Networks | Uses multimodal networks to reveal the mechanisms of vision-language integration in the brain. | contrastive learning, multimodal |  |
| 4 | Urban-Focused Multi-Task Offline Reinforcement Learning with Contrastive Data Sharing | Proposes MODA, which uses contrastive data sharing to address data sparsity and heterogeneity in urban multi-task offline reinforcement learning. | reinforcement learning, offline reinforcement learning |  |
| 5 | Bayesian Inverse Reinforcement Learning for Non-Markovian Rewards | Proposes a Bayesian inverse reinforcement learning method for learning non-Markovian reward functions. | reinforcement learning, inverse reinforcement learning |  |
| 6 | A General Control-Theoretic Approach for Reinforcement Learning: Theory and Algorithms | Proposes a control-theoretic approach to reinforcement learning that improves the quality and efficiency of policy learning. | reinforcement learning |  |
| 7 | Advantage Alignment Algorithms | Proposes advantage alignment algorithms to address Pareto-suboptimal cooperation among agents in general-sum games. | reinforcement learning, large language model |  |
| 8 | DeciMamba: Exploring the Length Extrapolation Potential of Mamba | DeciMamba: explores the length-extrapolation potential of the Mamba model. | Mamba |  |
| 9 | Revealing the Learning Process in Reinforcement Learning Agents Through Attention-Oriented Metrics | Proposes attention-oriented metrics (ATOMs) that reveal the learning patterns of reinforcement learning agents during training. | reinforcement learning |  |
| 10 | Revisiting Modularity Maximization for Graph Clustering: A Contrastive Learning Perspective | Proposes the MAGI framework, which performs graph clustering via modularity maximization from a contrastive-learning perspective, improving performance and scalability. | contrastive learning |  |
| 11 | ME-IGM: Individual-Global-Max in Maximum Entropy Multi-Agent Reinforcement Learning | ME-IGM: an algorithm based on the Individual-Global-Max principle for maximum-entropy multi-agent reinforcement learning. | reinforcement learning |  |