| 1 |
A Practical Introduction to Deep Reinforcement Learning |
Provides a practical introductory tutorial on deep reinforcement learning to lower the barriers to learning it |
reinforcement learning deep reinforcement learning DRL |
|
|
| 2 |
Block-Biased Mamba for Long-Range Sequence Processing |
Proposes B2S6 to address Mamba's shortcomings in long-sequence processing |
Mamba SSM state space model |
|
|
| 3 |
InfoPO: On Mutual Information Maximization for Large Language Model Alignment |
Proposes InfoPO to address overfitting in large language model alignment |
direct preference optimization large language model |
|
|
| 4 |
Cost Function Estimation Using Inverse Reinforcement Learning with Minimal Observations |
Proposes an iterative inverse reinforcement learning algorithm to optimize the cost function |
reinforcement learning inverse reinforcement learning |
|
|
| 5 |
DyGSSM: Multi-view Dynamic Graph Embeddings with State Space Model Gradient Update |
Proposes DyGSSM to address insufficient information extraction in dynamic graph representation learning |
SSM state space model representation learning |
|
|
| 6 |
DSADF: Thinking Fast and Slow for Decision Making |
Proposes a dual-system adaptive decision framework to improve the decision-making ability of RL agents |
reinforcement learning large language model foundation model |
|
|
| 7 |
Efficient Unstructured Pruning of Mamba State-Space Models for Resource-Constrained Environments |
Proposes an efficient unstructured pruning framework for deploying Mamba models in resource-constrained environments |
Mamba SSM |
|
|
| 8 |
A Multi-scale Representation Learning Framework for Long-Term Time Series Forecasting |
Proposes a multi-scale representation learning framework for long-term time series forecasting |
representation learning MAE |
|
|
| 9 |
Feasibility-Aware Pessimistic Estimation: Toward Long-Horizon Safety in Offline RL |
Proposes the FASP framework to address long-horizon safety in offline reinforcement learning |
reinforcement learning offline RL |
|
|
| 10 |
Continual Reinforcement Learning via Autoencoder-Driven Task and New Environment Recognition |
Proposes autoencoder-driven recognition of tasks and new environments to address continual reinforcement learning |
reinforcement learning |
|
|
| 11 |
Constrained Edge AI Deployment: Fine-Tuning vs Distillation for LLM Compression |
Proposes a self-distillation-based LLM compression method to cope with edge computing constraints |
distillation |
|
|
| 12 |
Credit Assignment and Efficient Exploration based on Influence Scope in Multi-agent Reinforcement Learning |
Proposes influence-scope-based credit assignment and efficient exploration methods for multi-agent reinforcement learning |
reinforcement learning |
|
|
| 13 |
SPAT: Sensitivity-based Multihead-attention Pruning on Time Series Forecasting Models |
Proposes the SPAT method to improve the computational efficiency of time series forecasting models |
Mamba MAE |
|
|
| 14 |
Low-Complexity Inference in Continual Learning via Compressed Knowledge Transfer |
Proposes a low-complexity inference framework to address computational cost in continual learning |
teacher-student distillation |
|
|