CF-JEPA: Mask-free forward prediction with asymmetric encoder utilization for time-series representation learning

📄 arXiv: 2606.07031v1

Authors: Jaehoon Lee, Sunghyun Sim

Category: cs.LG

Published: 2026-06-05


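💡 One-sentence takeaway

Proposes CF-JEPA, a mask-free JEPA framework that avoids the temporal-continuity disruption masking causes in time-series representation learning.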

🎯 Matched area: Pillar 2: RL Algorithms and Architecture (RL & Architecture)

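Keywords: self-supervised learning, time-series analysis, forward prediction, multi-horizon prediction, anomaly detection, electricity transformer temperature forecasting, deep learning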

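📋 Key points

  1. Existing self-supervised methods for time-series representation learning have two weaknesses: contrastive methods struggle to construct positive and negative pairs, and masking-based methods disrupt the temporal continuity of the signal.
  2. CF-JEPA is a mask-free framework that replaces masking with multi-horizon forward prediction, using the inherent temporal ordering of the series as the learning signal.
  3. CF-JEPA achieves the highest average accuracy and rank among self-supervised baselines on the UCR and UEA classification datasets, ranks second on univariate forecasting, and reduces multivariate forecasting mean squared error (MSE) by 27% by routing forecasting to the EMA target encoder.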

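📝 Abstract (translated summary)

Self-supervised learning (SSL) for time-series representation learning is dominated by two paradigms: contrastive methods, which face challenges in constructing positive and negative pairs, and masking-based methods, which disrupt the continuity of the time-series signal. Joint-Embedding Predictive Architecture (JEPA) offers a promising alternative by predicting in representation space. However, existing time-series JEPA variants still rely on masking and therefore inherit its continuity problem. Crop-based Forward JEPA (CF-JEPA) is a mask-free framework that replaces masking with multi-horizon forward prediction, directly using the inherent temporal ordering of time-series data as the learning signal. By exploiting a strong asymmetry between the online encoder and the exponential moving average (EMA) target encoder, CF-JEPA reduces multivariate forecasting mean squared error (MSE) by 27% at no additional training cost.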

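🔬 Method details

Problem definition: Existing time-series JEPA variants and other masking-based methods depend on masking, which removes parts of the signal and breaks its temporal continuity. The paper aims to learn time-series representations without that dependence.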

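Core idea: CF-JEPA replaces masking with multi-horizon forward prediction. Random crops serve as context views, and the model predicts short-, mid-, and long-horizon future representations in the forward temporal direction, so the input series is never masked and its continuity is preserved.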
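The sampling step behind this idea can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `sample_views`, the fixed target length, and the horizon offsets (gaps between the end of the context crop and the start of each target window) are all assumptions made for the example, since the summary does not specify them.

```python
import random


def sample_views(series_len, crop_len, horizons, target_len, rng):
    """Pick one random context crop and one forward target window per horizon.

    `horizons` maps a horizon name to an offset (hypothetical values), i.e. the
    gap between the end of the context crop and the start of that target.
    All returned ranges are half-open [start, end) index pairs.
    """
    max_offset = max(horizons.values())
    # The crop must leave room for the farthest target window after it.
    last_start = series_len - (crop_len + max_offset + target_len)
    if last_start < 0:
        raise ValueError("series too short for the requested crop and horizons")

    start = rng.randint(0, last_start)  # inclusive on both ends
    context = (start, start + crop_len)

    targets = {}
    for name, offset in horizons.items():
        t_start = context[1] + offset  # always after the context: forward only
        targets[name] = (t_start, t_start + target_len)
    return context, targets
```

No position inside the context crop is hidden or replaced; the learning signal comes only from where the target windows sit relative to the crop.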

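Technical framework: The architecture consists of an online encoder and an EMA target encoder, both produced by a single training run. The online encoder is used for classification, and the EMA target encoder is used for forecasting and anomaly detection.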
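For readers unfamiliar with EMA target encoders, the generic update rule is sketched below. The parameter layout (a dict of plain float lists) and the decay value of 0.99 are assumptions chosen for illustration; the summary does not state the decay or schedule CF-JEPA uses.

```python
def ema_update(target_params, online_params, decay=0.99):
    """Move each target parameter toward its online counterpart, in place.

    target <- decay * target + (1 - decay) * online

    `decay=0.99` is an illustrative default, not a value from the paper.
    """
    if not 0.0 <= decay <= 1.0:
        raise ValueError("decay must be in [0, 1]")
    for name, online_values in online_params.items():
        target_values = target_params[name]
        for i, online_w in enumerate(online_values):
            target_values[i] = decay * target_values[i] + (1.0 - decay) * online_w
    return target_params
```

Because the target weights are a running average of the online weights, they change more slowly than the online weights do, which is one plausible reason the two encoders end up with different feature characteristics.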

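Key innovations: CF-JEPA contributes a mask-free forward prediction mechanism and identifies a strong asymmetry between the online encoder and the EMA target encoder, which lets each encoder be used for the tasks it suits best.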

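Key design: The online encoder develops higher-rank discriminative features, while the EMA target encoder develops smoother, lower-rank temporal features. Classification is therefore routed to the online encoder, and forecasting and anomaly detection are routed to the EMA target encoder.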
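The routing rule described above amounts to a small lookup. The sketch below is only an illustration of that rule; the task names, the `ROUTING` table, and `select_encoder` are hypothetical names, and the encoders can be any objects.

```python
# Task -> encoder mapping as described in the summary (names are illustrative).
ROUTING = {
    "classification": "online",
    "forecasting": "ema_target",
    "anomaly_detection": "ema_target",
}


def select_encoder(task, online_encoder, ema_target_encoder):
    """Return the encoder that the routing rule assigns to `task`."""
    try:
        choice = ROUTING[task]
    except KeyError:
        raise ValueError(f"unknown task: {task!r}") from None
    return online_encoder if choice == "online" else ema_target_encoder
```

Both encoders come from the same training run, so applying this rule at inference time adds no training cost.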

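📊 Experimental highlights

Across 126 University of California, Riverside (UCR) and 26 University of East Anglia (UEA) classification datasets, CF-JEPA achieves the highest average accuracy and rank among self-supervised baselines. In the forecasting evaluation, which uses eight electricity transformer temperature benchmarks, it ranks second on univariate forecasting, and routing forecasting to the EMA target encoder reduces multivariate forecasting MSE by 27% at no additional training cost. It also ranks second on k-nearest neighbors-scored anomaly detection on the Key Performance Indicator and Yahoo datasets.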

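🎯 Application scenarios

CF-JEPA has broad potential in time-series analysis, particularly in areas such as electricity load forecasting, financial market analysis, and industrial equipment monitoring. Because it learns without masking, it may capture the dynamics of a time series more faithfully, which could improve forecasting accuracy and robustness.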

📄 Abstract (original)

Self-supervised learning (SSL) for time-series representation learning is dominated by two paradigms: contrastive methods, which face challenges in constructing positive or negative pairs, and masking-based methods, which disrupt the temporal continuity of time-series signals. Joint-Embedding Predictive Architecture (JEPA) offers a promising alternative by predicting in representation space rather than reconstructing raw inputs. However, existing time-series JEPA variants still rely on masking and therefore inherit its continuity problem. Crop-based Forward JEPA (CF-JEPA) is proposed as an innovative mask-free framework that replaces masking with multi-horizon forward prediction: random crops serve as context views, and short-, mid-, and long-horizon future representations are predicted in the forward temporal direction, directly leveraging the inherent temporal ordering of time-series data as a learning signal. A strong asymmetry is also identified between the online encoder and the exponential moving average (EMA) target encoder, both produced from a single training run: the online encoder develops higher-rank discriminative features, while the EMA target encoder develops smoother, lower-rank temporal features. Exploiting this asymmetry, classification is routed to the online encoder and forecasting or anomaly detection to the EMA target encoder, achieving a 27% reduction in multivariate forecasting mean squared error (MSE) at no additional training cost. Across 126 University of California, Riverside (UCR) and 26 University of East Anglia (UEA) classification datasets, eight electricity transformer temperature forecasting benchmarks, and Key Performance Indicator/Yahoo anomaly detection, CF-JEPA achieves the highest average accuracy and rank on UCR and UEA among self-supervised baselines and ranks second on univariate forecasting and k-nearest neighbors-scored anomaly detection.
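The abstract mentions k-nearest neighbors-scored anomaly detection without giving the scoring formula. One common form of such a score, the mean Euclidean distance from a query embedding to its k nearest reference embeddings, can be sketched as below. The function name, the choice of Euclidean distance, and the default k=3 are assumptions for illustration and may differ from what the paper uses.

```python
import math


def knn_anomaly_score(query, reference, k=3):
    """Mean Euclidean distance from `query` to its k nearest reference points.

    A larger score means the query lies farther from the reference
    embeddings and is therefore more likely to be anomalous.
    """
    if k < 1 or k > len(reference):
        raise ValueError("k must be between 1 and len(reference)")
    distances = sorted(math.dist(query, ref) for ref in reference)
    return sum(distances[:k]) / k
```

In a pipeline like the one described, `query` and `reference` would be embeddings produced by the EMA target encoder, with `reference` drawn from data assumed to be normal.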