| 12 |
Bootstrap Theory of Representational Emergence: Explanatory Insufficiency as a Driver of Representation Learning and World Models |
Proposes a bootstrap theory of representational emergence to address the insufficiency of existing representations |
world model world models representation learning |
|
|
| 13 |
CF-JEPA: Mask-free forward prediction with asymmetric encoder utilization for time-series representation learning |
Proposes CF-JEPA to address masking issues in time-series representation learning |
JEPA Joint-Embedding Predictive Architecture joint-embedding predictive architecture |
|
|
| 14 |
Time series Foundation Models based on Physics-Informed Synthetic Histories for Cold-Start Photovoltaic Forecasting |
Proposes time-series foundation models based on physics-informed synthetic histories to address cold-start photovoltaic forecasting |
MAE foundation model |
|
|
| 15 |
Learning Explicit Behavioral Models with Adaptive Questions and World-Model Probes |
Proposes explicit symbolic behavioral models to address the limited adaptability of interactive agents |
policy learning world model world models |
|
|
| 16 |
On the Geometry of On-Policy Distillation |
Proposes a new approach to understanding the geometry of parameter updates in on-policy distillation |
reinforcement learning distillation large language model |
|
|
| 17 |
SCALE: Scalable Cross-Attention Learning with Extrapolation for Agentic Workflow Scheduling |
Proposes SCALE to address the scalability of heterogeneous cluster scheduling |
reinforcement learning deep reinforcement learning DRL |
|
|
| 18 |
Self-evolving LLM agents with in-distribution Optimization |
Proposes the Q-Evolve framework to address credit assignment in long-horizon decision-making |
reinforcement learning policy learning large language model |
|
|
| 19 |
Unsupervised Continual Clustering via Forward-Backward Knowledge Distillation |
Proposes an unsupervised continual clustering method to address catastrophic forgetting |
distillation |
|
|
| 20 |
A Held-Out Transition-Pair Falsifier for Long-Horizon Non-Abelian State Tracking |
Proposes a held-out transition-pair falsifier to address long-horizon non-Abelian state tracking |
SSM OMOMO |
|
|
| 21 |
SlimSearcher: Training Efficiency-Aware Web Agents via Adaptive Reward Gating |
Proposes SlimSearcher to address the computational efficiency of deep research agents |
reinforcement learning reward shaping |
|
|