Spatiotemporal EEG-Based Emotion Recognition Using SAM Ratings from Serious Games with Hybrid Deep Learning

📄 arXiv: 2508.21103v1 📥 PDF

Authors: Abdul Rehman, Ilona Heldal, Jerry Chun-Wei Lin

Categories: cs.LG, cs.AI

Published: 2025-08-28


💡 One-Sentence Takeaway

Proposes a unified multi-granularity EEG emotion classification framework to overcome the limitations of existing methods

🎯 Matched field: Pillar 8: Physics-based Animation

Keywords: EEG emotion recognition, deep learning, multi-granularity classification, affective computing, game data, LSTM-GRU, feature extraction

📋 Key Points

  1. Existing EEG emotion recognition methods mostly target binary or subject-specific classification, which limits their broad applicability and generalizability.
  2. This paper proposes a multi-granularity EEG emotion classification framework that uses the GAMEEMO dataset for multi-axis encoding and classification of emotion labels.
  3. The LSTM-GRU model performs best across tasks, reaching an F1-score of 0.932 on the binary valence task and accuracies of 94.5% and 90.6% on multi-class and multi-label classification, respectively.

📝 Abstract (translated)

EEG-based emotion recognition has advanced considerably in recent years with both deep learning and classical machine learning approaches, but most studies focus on binary valence prediction or subject-specific classification, limiting their deployment in practical affective computing systems. To address this, the paper presents a unified multi-granularity EEG emotion classification framework built on the GAMEEMO dataset, which contains 14-channel EEG recordings and continuous self-reported emotion ratings from 28 subjects across four emotion-inducing gameplay scenarios. The pipeline combines a structured preprocessing strategy, hybrid statistical and frequency-domain feature extraction, and z-score normalization to convert raw EEG signals into robust input vectors. Across the evaluated models, the LSTM-GRU achieved an F1-score of 0.932 on the binary valence task and accuracies of 94.5% and 90.6% on multi-class and multi-label emotion classification, respectively.

🔬 Method Details

Problem definition: The paper targets the shortcomings of existing EEG emotion recognition methods in classification accuracy and applicability, in particular their restriction to binary valence and subject-specific classification.

Core idea: Propose a unified multi-granularity emotion classification framework that encodes emotion labels along multiple axes to improve generalization and accuracy.

Technical framework: The overall pipeline comprises data preprocessing, feature extraction, emotion label encoding, and model training. Preprocessing uses temporal window segmentation and z-score normalization; feature extraction combines statistical and frequency-domain methods.
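A minimal numpy sketch of this preprocessing stage, assuming a 128 Hz sampling rate, 4-second windows with 50% overlap, mean/std statistics, and theta/alpha/beta band powers; the paper's exact window length, overlap, feature set, and frequency bands are not specified in this summary, so these values are illustrative:

```python
import numpy as np

def window_segments(eeg, win_len, step):
    """Split a (channels, samples) EEG array into overlapping windows."""
    _, n = eeg.shape
    return [eeg[:, s:s + win_len] for s in range(0, n - win_len + 1, step)]

def band_power(window, fs, band):
    """Mean FFT power of each channel within a frequency band [lo, hi)."""
    freqs = np.fft.rfftfreq(window.shape[1], d=1.0 / fs)
    psd = np.abs(np.fft.rfft(window, axis=1)) ** 2
    mask = (freqs >= band[0]) & (freqs < band[1])
    return psd[:, mask].mean(axis=1)

def extract_features(window, fs=128):
    """Hybrid statistical + frequency-domain feature vector for one window."""
    stats = np.concatenate([window.mean(axis=1), window.std(axis=1)])
    bands = {"theta": (4, 8), "alpha": (8, 13), "beta": (13, 30)}
    freq = np.concatenate([band_power(window, fs, b) for b in bands.values()])
    return np.concatenate([stats, freq])

def zscore(X):
    """Column-wise z-score normalization across all feature vectors."""
    X = np.asarray(X)
    return (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-8)

# Toy run: a 14-channel, 20-second signal at 128 Hz
rng = np.random.default_rng(0)
eeg = rng.standard_normal((14, 128 * 20))
windows = window_segments(eeg, win_len=128 * 4, step=128 * 2)
X = zscore([extract_features(w) for w in windows])
print(X.shape)  # (windows, 14 channels x (2 stats + 3 band powers))
```

Each window yields one normalized feature vector, so the output matrix feeds directly into either the classical models or the recurrent networks described below.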

Key innovation: The main contribution is the multi-axis emotion label encoding, covering binary valence classification, multi-class classification, and a fine-grained multi-label representation, which markedly improves recognition accuracy and applicability.
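The three label axes can be sketched as follows, assuming for illustration that each of GAMEEMO's four emotion ratings (boring, horrible, calm, funny) lies on a 0–5 scale; the dataset's actual rating scale and the paper's exact binning thresholds are not given in this summary:

```python
EMOTIONS = ["boring", "horrible", "calm", "funny"]
POSITIVE = {"calm", "funny"}

def encode_labels(ratings, scale_max=5.0):
    """Derive the three label encodings from one dict of emotion ratings."""
    # (i) Binary valence: averaged positive vs. averaged negative ratings
    pos = sum(ratings[e] for e in EMOTIONS if e in POSITIVE) / 2
    neg = sum(ratings[e] for e in EMOTIONS if e not in POSITIVE) / 2
    binary = int(pos >= neg)  # 1 = positive valence
    # (ii) Multi-class: the dominant (highest-rated) emotion
    multiclass = max(EMOTIONS, key=lambda e: ratings[e])
    # (iii) Multi-label: bin each rating into 10 ordinal classes (0..9)
    multilabel = {e: min(int(ratings[e] * 10 // scale_max), 9)
                  for e in EMOTIONS}
    return binary, multiclass, multilabel

b, m, ml = encode_labels(
    {"boring": 1.0, "horrible": 0.5, "calm": 3.5, "funny": 4.0})
print(b, m, ml)
```

The same EEG window thus carries three targets of increasing granularity, which is what lets one pipeline serve binary, multi-class, and multi-label evaluation.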

Key design: An LSTM-GRU model serves as the primary architecture and is compared against Random Forest, XGBoost, and SVM baselines, with hyperparameters tuned to improve performance.

🖼️ Key Figures

(Figure placeholders fig_0, fig_1, and fig_2 from the source page; captions not preserved.)

📊 Experimental Highlights

In the experiments, the LSTM-GRU model achieved an F1-score of 0.932 on the binary valence task, clearly outperforming the other models. On multi-class and multi-label emotion classification it reached accuracies of 94.5% and 90.6%, respectively, demonstrating the effectiveness of the framework.

🎯 Application Scenarios

Potential application areas include affective computing, game design, and mental health monitoring. Accurate recognition of a user's emotional state can support personalized experiences, affective feedback systems, and psychological interventions, giving the work practical value and future impact.

📄 Abstract (original)

Recent advancements in EEG-based emotion recognition have shown promising outcomes using both deep learning and classical machine learning approaches; however, most existing studies focus narrowly on binary valence prediction or subject-specific classification, which limits generalizability and deployment in real-world affective computing systems. To address this gap, this paper presents a unified, multigranularity EEG emotion classification framework built on the GAMEEMO dataset, which consists of 14-channel EEG recordings and continuous self-reported emotion ratings (boring, horrible, calm, and funny) from 28 subjects across four emotion-inducing gameplay scenarios. Our pipeline employs a structured preprocessing strategy that comprises temporal window segmentation, hybrid statistical and frequency-domain feature extraction, and z-score normalization to convert raw EEG signals into robust, discriminative input vectors. Emotion labels are derived and encoded across three complementary axes: (i) binary valence classification based on the averaged polarity of positive and negative emotion ratings, and (ii) Multi-class emotion classification, where the presence of the most affective state is predicted. (iii) Fine-grained multi-label representation via binning each emotion into 10 ordinal classes. We evaluate a broad spectrum of models, including Random Forest, XGBoost, and SVM, alongside deep neural architectures such as LSTM, LSTM-GRU, and CNN-LSTM. Among these, the LSTM-GRU model consistently outperforms the others, achieving an F1-score of 0.932 in the binary valence task and 94.5% and 90.6% in both multi-class and Multi-Label emotion classification.