Are LLMs Enough for Hyperpartisan, Fake, Polarized and Harmful Content Detection? Evaluating In-Context Learning vs. Fine-Tuning

Authors: Michele Joshua Maggini, Dhia Merzougui, Rabiraj Bandyopadhyay, Gaël Dias, Fabrice Maurel, Pablo Gamallo

Categories: cs.CL, cs.AI

Published: 2025-09-09

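💡 One-Sentence Takeaway

Compares In-Context Learning with fine-tuning to evaluate how well large language models detect hyperpartisan, fake, polarized and harmful content.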

🎯 Matched area: Pillar 9: Embodied Foundation Models

Keywords: large language models, harmful content detection, In-Context Learning, fine-tuning, multilingual processing

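📋 Key Points

Fake news, hyperpartisan content and harmful content are spreading on online platforms, and existing methods struggle with detection across multiple languages and content types.
The paper compares two paradigms, In-Context Learning and fine-tuning, to assess how well large language models detect such content, with experiments across several languages and datasets.
The experiments show that fine-tuned models generally outperform In-Context Learning, and that even smaller fine-tuned models can beat larger models used in an In-Context Learning setup.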

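📝 Abstract (Translated)

This paper evaluates how well different large language models detect fake news, hyperpartisan content, political bias and harmful content on online platforms. The study covers 10 datasets and 5 languages (English, Spanish, Portuguese, Arabic and Bulgarian), in both binary and multiclass classification settings. The experiments compare parameter-efficient fine-tuning with several In-Context Learning strategies: zero-shot prompts, codebooks, few-shot learning (with examples selected either at random or through Determinantal Point Processes), and Chain-of-Thought. The main finding is that In-Context Learning often underperforms fine-tuned models. This underlines the value of fine-tuning even smaller models on the specific task, even when they are compared with the largest models evaluated in the In-Context Learning setup (LlaMA3.1-8b-Instruct, Mistral-Nemo-Instruct-2407 and Qwen2.5-7B-Instruct).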

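🔬 Method Details

Problem definition: The paper addresses the difficulty of reliably detecting fake news, hyperpartisan content, political bias and harmful content on online platforms. Existing methods generalize poorly across languages and datasets and have trouble recognizing different types of harmful content, which makes it hard for them to keep up with a fast-changing online environment.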

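Core idea: The paper compares In-Context Learning and fine-tuning to evaluate how large language models perform at detecting harmful content. The experimental analysis identifies which paradigm suits this kind of task better and offers guidance for practical use.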

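Technical framework: The framework has four parts: 1) Dataset preparation: collect and organize datasets covering several languages and different types of harmful content; 2) Model selection: choose a set of representative large language models, such as LlaMA3.1-8b-Instruct, Mistral-Nemo-Instruct-2407 and Qwen2.5-7B-Instruct; 3) Method implementation: implement both In-Context Learning and fine-tuning, adapted to each dataset and language; 4) Experimental evaluation: measure model performance with metrics such as accuracy and recall, and compare the results.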
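The evaluation step in the framework above can be made concrete with a small sketch. The function names, the toy labels and the choice of macro-averaged recall are illustrative assumptions, not details taken from the paper.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the gold labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_recall(y_true, y_pred):
    """Recall per gold label: correct predictions / gold occurrences."""
    gold_counts = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / n for label, n in gold_counts.items()}


def macro_recall(y_true, y_pred):
    """Unweighted mean of the per-class recalls (assumed averaging scheme)."""
    recalls = per_class_recall(y_true, y_pred)
    return sum(recalls.values()) / len(recalls)


# Toy binary example (hypothetical labels, not data from the paper).
gold = ["fake", "real", "real", "fake", "real"]
pred = ["fake", "real", "fake", "real", "real"]

print(accuracy(gold, pred))          # 3 of 5 predictions are correct
print(per_class_recall(gold, pred))  # recall for "fake" and for "real"
print(macro_recall(gold, pred))
```

Macro averaging gives every class the same weight, which matters for imbalanced datasets; whether the paper averages this way is not stated in the summary.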

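Key contribution: The paper's main contribution is a broad comparison of In-Context Learning and fine-tuning, which shows the advantage of fine-tuning for detecting harmful content. It also explores different In-Context Learning strategies (zero-shot prompts, codebooks, few-shot learning and Chain-of-Thought) and analyzes how they affect model performance.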
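To illustrate how the four In-Context Learning strategies named above differ at the prompt level, here is a minimal sketch. The function `build_prompt`, the template wording and the example labels are hypothetical; the paper's actual prompts are not reproduced in this summary.

```python
def build_prompt(text, labels, strategy="zero_shot", codebook=None, examples=None):
    """Assemble a classification prompt for one of four ICL strategies."""
    parts = [f"Classify the text into one of: {', '.join(labels)}."]
    if strategy == "codebook":
        # A codebook adds a written definition for every label.
        parts += [f"- {label}: {codebook[label]}" for label in labels]
    elif strategy == "few_shot":
        # Few-shot prepends labeled demonstrations before the query.
        parts += [f"Text: {ex_text}\nLabel: {ex_label}" for ex_text, ex_label in examples]
    elif strategy == "cot":
        # Chain-of-Thought asks for reasoning before the final answer.
        parts.append("Think step by step, then give the final label.")
    elif strategy != "zero_shot":
        raise ValueError(f"unknown strategy: {strategy}")
    parts.append(f"Text: {text}\nLabel:")
    return "\n".join(parts)


# Hypothetical label set and demonstration, for illustration only.
labels = ["hyperpartisan", "neutral"]
demos = [("They always lie to you!", "hyperpartisan")]
print(build_prompt("The council met on Tuesday.", labels, "few_shot", examples=demos))
```

Keeping the instruction and the final query identical across strategies means any performance difference can be attributed to the added context rather than to wording changes.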

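Key design: For In-Context Learning, the paper tries several prompting strategies: zero-shot prompts, codebooks, few-shot learning (with examples selected either at random or through Determinantal Point Processes), and Chain-of-Thought. For fine-tuning, it uses parameter-efficient fine-tuning to reduce computational cost. Specific parameter settings and loss functions were adjusted for each dataset and model.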
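The diverse few-shot selection mentioned above can be sketched as follows. The summary only states that Determinantal Point Processes were used; the greedy selection rule, the cosine-similarity kernel and the toy 2-D embeddings below are assumptions chosen to keep the example short and dependency-free.

```python
import math


def det(matrix):
    """Determinant via Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    n = len(m)
    result = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            return 0.0  # numerically singular
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result
        result *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return result


def similarity_kernel(vectors):
    """Gram matrix of L2-normalized vectors, i.e. cosine similarities."""
    unit = [[x / math.sqrt(sum(v * v for v in vec)) for x in vec] for vec in vectors]
    return [[sum(a * b for a, b in zip(u, w)) for w in unit] for u in unit]


def greedy_dpp_select(kernel, k):
    """Greedily add the item that maximizes the determinant of the chosen submatrix."""
    selected = []
    for _ in range(k):
        best_item, best_det = None, -1.0
        for i in range(len(kernel)):
            if i in selected:
                continue
            idx = selected + [i]
            sub = [[kernel[a][b] for b in idx] for a in idx]
            d = det(sub)
            if d > best_det:
                best_item, best_det = i, d
        selected.append(best_item)
    return selected


# Toy embeddings: items 0 and 1 are near-duplicates, item 2 points elsewhere.
embeddings = [[1.0, 0.0], [0.99, 0.1], [0.0, 1.0]]
chosen = greedy_dpp_select(similarity_kernel(embeddings), k=2)
print(chosen)
```

Because the determinant shrinks when two selected items are similar, the selection avoids taking both near-duplicates and includes the dissimilar item instead. Random selection has no such preference for diversity.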

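📊 Experimental Highlights

The experiments show that fine-tuned models generally outperform In-Context Learning at detecting harmful content. Even smaller models, once fine-tuned, can beat larger models used in an In-Context Learning setup. This finding underlines the value of task-specific fine-tuning and gives a useful reference point for practical deployments.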

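🎯 Application Scenarios

The results apply to online content moderation, public-opinion analysis and disinformation detection, where they can help platforms identify and filter harmful content more effectively. Better models and methods can raise the efficiency and accuracy of moderation, reduce the cost of manual review, and give users a safer and more reliable information service.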

📄 Abstract (Original)

The spread of fake news, polarizing, politically biased, and harmful content on online platforms has been a serious concern. With large language models becoming a promising approach, however, no study has properly benchmarked their performance across different models, usage methods, and languages. This study presents a comprehensive overview of different Large Language Models adaptation paradigms for the detection of hyperpartisan and fake news, harmful tweets, and political bias. Our experiments spanned 10 datasets and 5 different languages (English, Spanish, Portuguese, Arabic and Bulgarian), covering both binary and multiclass classification scenarios. We tested different strategies ranging from parameter efficient Fine-Tuning of language models to a variety of different In-Context Learning strategies and prompts. These included zero-shot prompts, codebooks, few-shot (with both randomly-selected and diversely-selected examples using Determinantal Point Processes), and Chain-of-Thought. We discovered that In-Context Learning often underperforms when compared to Fine-Tuning a model. This main finding highlights the importance of Fine-Tuning even smaller models on task-specific settings even when compared to the largest models evaluated in an In-Context Learning setup - in our case LlaMA3.1-8b-Instruct, Mistral-Nemo-Instruct-2407 and Qwen2.5-7B-Instruct.
