| 1 |
Pimba: A Processing-in-Memory Acceleration for Post-Transformer Large Language Model Serving |
Pimba: a processing-in-memory acceleration scheme for serving post-Transformer large language models. |
SSM state space model linear attention |
|
|
| 2 |
Offline Reinforcement Learning with Wasserstein Regularization via Optimal Transport Maps |
Proposes an offline reinforcement learning method based on optimal transport maps and Wasserstein regularization to address distribution shift. |
reinforcement learning offline RL offline reinforcement learning |
✅ |
|
| 3 |
GHPO: Adaptive Guidance for Stable and Efficient LLM Reinforcement Learning |
Proposes GHPO, an adaptive-guidance framework for stable and efficient LLM reinforcement learning. |
reinforcement learning imitation learning curriculum learning |
|
|
| 4 |
Graph World Model |
Proposes the Graph World Model (GWM), which handles unstructured and graph-structured data in a unified way and supports multi-modal tasks. |
world model foundation model |
✅ |
|
| 5 |
Reasoning or Memorization? Unreliable Results of Reinforcement Learning Due to Data Contamination |
Reveals how data contamination skews the evaluation of mathematical reasoning in RL-fine-tuned large models, and introduces RandomCalculation, a clean dataset. |
reinforcement learning large language model |
|
|
| 6 |
MoCap-Impute: A Comprehensive Benchmark and Comparative Analysis of Imputation Methods for IMU-based Motion Capture Data |
MoCap-Impute: a comprehensive benchmark and comparative analysis of imputation methods for missing values in IMU-based motion capture data. |
MAE IMU-based motion |
|
|
| 7 |
Recognizing Dementia from Neuropsychological Tests with State Space Models |
Proposes Demenba, a state-space-model-based framework for automatic dementia recognition from neuropsychological tests. |
state space model large language model |
|
|
| 8 |
A Generalizable Physics-Enhanced State Space Model for Long-Term Dynamics Forecasting in Complex Environments |
Proposes Phy-SSM, a state space model that incorporates physics knowledge for long-term dynamics forecasting in complex environments. |
SSM state space model |
✅ |
|
| 9 |
Compression Method for Deep Diagonal State Space Model Based on $H^2$ Optimal Reduction |
Proposes a compression method for deep diagonal state space models based on $H^2$ optimal reduction. |
SSM state space model |
|
|
| 10 |
FusionFactory: Fusing LLM Capabilities with Multi-LLM Log Data |
FusionFactory: fuses multi-LLM log data to improve LLM performance across diverse tasks. |
distillation large language model |
|
|
| 11 |
Feature Distillation is the Better Choice for Model-Heterogeneous Federated Learning |
Proposes FedFD, a feature-distillation-based approach to model-heterogeneous federated learning. |
distillation |
|
|
| 12 |
Text-Driven Causal Representation Learning for Source-Free Domain Generalization |
Proposes TDCRL, which tackles source-free domain generalization via text-driven causal representation learning. |
representation learning |
|
|
| 13 |
Multi-Armed Sampling Problem and the End of Exploration |
Introduces the multi-armed sampling framework, proves that sampling requires no exploration, and provides a theoretical foundation for entropy-regularized reinforcement learning and related settings. |
reinforcement learning RLHF |
|
|