| 1 |
Towards Foundation Models for Consensus Rank Aggregation |
Proposes the Kemeny Transformer, a Transformer-based model for efficient consensus rank aggregation. |
reinforcement learning, foundation model |
|
|
| 2 |
Deep Reinforcement Learning for Fano Hypersurfaces |
Proposes a deep reinforcement learning algorithm for exploring Fano hypersurfaces. |
reinforcement learning, deep reinforcement learning |
|
|
| 3 |
Effective Distillation to Hybrid xLSTM Architectures |
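Proposes an effective distillation pipeline that distills large language models into hybrid xLSTM architectures, achieving matching or even superior performance. |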
distillation, large language model |
|
|
| 4 |
Mamba-3: Improved Sequence Modeling using State Space Principles |
Mamba-3: improves sequence modeling using state space model principles, raising both inference efficiency and model quality. |
Mamba, SSM, state space model |
|
|
| 5 |
Amplification Effects in Test-Time Reinforcement Learning: Safety and Reasoning Vulnerabilities |
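Shows that test-time reinforcement learning is vulnerable to harmful prompt injection attacks, leading to amplification effects on safety and a decline in reasoning ability. |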
reinforcement learning, large language model |
|
|
| 6 |
Safe Flow Q-Learning: Offline Safe Reinforcement Learning with Reachability-Based Flow Policies |
Proposes SafeFQL, which combines reachability analysis with flow policies to address offline safe reinforcement learning. |
reinforcement learning, offline RL |
|
|
| 7 |
TabKD: Tabular Knowledge Distillation through Interaction Diversity of Learned Feature Bins |
TabKD: tabular knowledge distillation through the interaction diversity of learned feature bins. |
distillation |
|
|
| 8 |
TrajFlow: Nation-wide Pseudo GPS Trajectory Generation with Flow Matching Models |
TrajFlow: nation-wide pseudo-GPS trajectory generation based on flow matching models. |
flow matching |
|
|
| 9 |
Photonic Quantum-Enhanced Knowledge Distillation |
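Proposes the Photonic Quantum-Enhanced Knowledge Distillation (PQKD) framework, which uses photonic circuits to improve model compression performance. |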
distillation |
|
|
| 10 |
Sample-Efficient Hypergradient Estimation for Decentralized Bi-Level Reinforcement Learning |
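Proposes a hypergradient estimation method based on the Boltzmann covariance trick to solve decentralized bi-level reinforcement learning problems. |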
reinforcement learning |
|
|
| 11 |
Dataset Distillation Efficiently Encodes Low-Dimensional Representations from Gradient-Based Learning of Non-Linear Tasks |
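Theoretically analyzes dataset distillation, showing that it efficiently encodes the low-dimensional representations arising from gradient-based learning of non-linear tasks. |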
distillation |
|
|
| 12 |
DeFRiS: Silo-Cooperative IoT Applications Scheduling via Decentralized Federated Reinforcement Learning |
DeFRiS: silo-cooperative IoT application scheduling via decentralized federated reinforcement learning. |
reinforcement learning |
|
|
| 13 |
Faster Inference of Flow-Based Generative Models via Improved Data-Noise Coupling |
Proposes LOOM-CFM, which speeds up inference of flow models by optimizing the data-noise coupling across minibatches. |
flow matching, distillation |
|
|
| 14 |
Sampling-guided exploration of active feature selection policies |
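Proposes a sampling-guided active feature selection strategy that improves classification performance on high-dimensional data while reducing feature acquisition cost. |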
reinforcement learning, predictive model |
|
|