| 1 |
Deep Reinforcement Learning for Phishing Detection with Transformer-Based Semantic Features |
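Proposes a QR-DQN deep reinforcement learning method built on Transformer-based semantic features to improve the accuracy and generalization of phishing website detection. |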
reinforcement learning deep reinforcement learning |
|
|
| 2 |
Statistical analysis of Inverse Entropy-regularized Reinforcement Learning |
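Proposes a statistical analysis framework based on entropy-regularized inverse reinforcement learning that addresses the non-uniqueness of the reward function. |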
reinforcement learning behavior cloning inverse reinforcement learning |
|
|
| 3 |
Parent-Guided Semantic Reward Model (PGSRM): Embedding-Based Reward Functions for Reinforcement Learning of Transformer Language Models |
Proposes the Parent-Guided Semantic Reward Model for reinforcement learning of Transformer language models. |
reinforcement learning PPO RLHF |
|
|
| 4 |
State Diversity Matters in Offline Behavior Distillation |
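Proposes a state-density-weighted offline behavior distillation algorithm that increases state diversity to improve policy learning. |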
offline RL distillation |
|
|
| 5 |
Always Keep Your Promises: DynamicLRP, A Model-Agnostic Solution To Layer-Wise Relevance Propagation |
Proposes DynamicLRP, a model-agnostic solution for layer-wise relevance propagation. |
Mamba multimodal |
✅ |
|
| 6 |
Adaptive Normalization Mamba with Multi Scale Trend Decomposition and Patch MoE Encoding |
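Proposes AdaMamba, which uses adaptive normalization and multi-scale trend decomposition to improve the stability and accuracy of time series forecasting. |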
Mamba |
|
|
| 7 |
Know your Trajectory -- Trustworthy Reinforcement Learning deployment through Importance-Based Trajectory Analysis |
Proposes an importance-based trajectory analysis method to improve the trustworthiness of reinforcement learning deployments. |
reinforcement learning |
|
|