| 10 | Distilling Time Series Foundation Models for Efficient Forecasting | Proposes DistilTS, an efficient distillation method for time series foundation models, enabling efficient forecasting. | distillation, foundation model | ✅ |
| 11 | On the Relation of State Space Models and Hidden Markov Models | A unified framework comparing hidden Markov models and state space models, bridging control theory, probabilistic modeling, and deep learning. | Mamba, SSM, state space model | |
| 12 | Balancing Classification and Calibration Performance in Decision-Making LLMs via Calibration Aware Reinforcement Learning | Proposes calibration-aware reinforcement learning to balance classification performance and calibrated confidence in decision-making LLMs. | reinforcement learning, large language model | |
| 13 | Analysis of Long Range Dependency Understanding in State Space Models | Proposes the first kernel-interpretability analysis of long-range dependency understanding in S4D models, applied to source-code vulnerability detection. | SSM, state space model | |
| 14 | Training instability in deep learning follows low-dimensional dynamical principles | Proposes a unified dynamical-systems perspective for studying instability in deep learning training. | reinforcement learning, large language model | |
| 15 | Recursive Meta-Distillation: An Axiomatic Framework for Iterative Knowledge Refinement | Proposes a recursive meta-distillation framework that gives iterative knowledge refinement an axiomatic theoretical foundation. | distillation | |
| 16 | Knowledge-Integrated Representation Learning for Crypto Anomaly Detection under Extreme Label Scarcity; Relational Domain-Logic Integration with Retrieval-Grounded Context and Path-Level Explanations | Proposes the RDLI framework to address label scarcity and adversarial attacks in cryptocurrency anomaly detection. | representation learning | |
| 17 | Distribution-Centric Policy Optimization Dominates Exploration-Exploitation Trade-off | Proposes Distribution-Centric Policy Optimization (DCPO) to address the exploration-exploitation trade-off in reinforcement learning for LLMs. | reinforcement learning, large language model | ✅ |
| 18 | Decoding Rewards in Competitive Games: Inverse Game Theory with Entropy Regularization | Proposes an entropy-regularized inverse game theory framework for reconstructing reward functions in competitive games. | reinforcement learning, inverse reinforcement learning | |