Classifier-to-Bias: Toward Unsupervised Automatic Bias Detection for Visual Classifiers

Authors: Quentin Guimard, Moreno D'Incà, Massimiliano Mancini, Elisa Ricci

Categories: cs.CV, cs.AI, cs.LG

Published: 2025-04-29

Notes: CVPR 2025. Code: https://github.com/mardgui/C2B

💡 One-Sentence Takeaway

Proposes Classifier-to-Bias (C2B), a framework for unsupervised, automatic bias detection in visual classifiers.

🎯 Matched Area: Pillar 9: Embodied Foundation Models

Keywords: bias detection, unsupervised learning, large language models, visual classifiers, fairness, interpretability

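📋 Key Points

Existing bias identification methods rely on labeled data, which limits where they can be applied: non-experts often cannot obtain or annotate such data.
C2B uses a large language model to generate bias proposals and image captions, so it can detect model biases without any labeled data.
Experiments show that C2B discovers biases beyond those of the original datasets and outperforms a recent state-of-the-art bias detection baseline that relies on task-specific annotations.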

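📝 Abstract (Summary)

This paper presents Classifier-to-Bias (C2B), a bias discovery framework that needs no labeled data. C2B relies only on a textual description of the classification task to identify biases in the target classification model. The description is fed to a large language model, which generates bias proposals and corresponding captions that depict the biases together with task-specific target labels. A retrieval model collects images for those captions, and the images are then used to assess the model's accuracy with respect to each bias. C2B is training-free, requires no annotations, places no constraints on the list of biases, and can be applied to any pre-trained model on any classification task. Experiments on two publicly available datasets show that C2B discovers biases beyond those of the original datasets and outperforms a recent state-of-the-art bias detection baseline that relies on task-specific annotations, making it a promising first step toward task-agnostic unsupervised bias detection.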

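🔬 Method Details

Problem definition: Existing bias detection methods depend on labeled data, which limits their use in practice. Many users cannot obtain labeled data or cannot afford to annotate it, especially for specialized domains or tasks. Detecting bias without labeled data is therefore the central problem.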

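Core idea: C2B uses the generative ability of a large language model (LLM) to produce bias proposals and matching image captions directly from a textual description of the classification task. A retrieval model then collects images that match those captions, and the images are used to measure how the target classifier performs with respect to each proposed bias.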
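As an illustration of the first step, the sketch below builds a prompt from a task description and parses a JSON reply into bias proposals and captions. The prompt wording, the JSON schema, the caption template, and every name (`BiasProposal`, `build_prompt`, `parse_proposals`, `make_captions`) are assumptions made for this example; the summary does not state the prompts or the LLM the paper uses.

```python
import json
from dataclasses import dataclass


@dataclass
class BiasProposal:
    name: str          # attribute that may skew predictions, e.g. "age"
    values: list[str]  # groups to compare, e.g. ["young", "old"]


def build_prompt(task_description: str, class_names: list[str]) -> str:
    """Assemble a hypothetical prompt asking an LLM for bias proposals."""
    return (
        "You audit image classifiers for bias.\n"
        f"Task: {task_description}\n"
        f"Classes: {', '.join(class_names)}\n"
        "List attributes unrelated to the task that could skew predictions. "
        'Answer as JSON: [{"name": str, "values": [str, ...]}]'
    )


def parse_proposals(llm_response: str) -> list[BiasProposal]:
    """Parse the JSON reply and drop entries with fewer than two values."""
    proposals = []
    for item in json.loads(llm_response):
        values = [v.strip() for v in item.get("values", []) if v.strip()]
        if len(values) >= 2:  # a bias needs at least two groups to compare
            proposals.append(BiasProposal(item["name"].strip(), values))
    return proposals


def make_captions(
    proposal: BiasProposal,
    class_names: list[str],
    template: str = "a photo of a {value} person with {label}",
) -> list[tuple[str, str, str]]:
    """One (caption, bias value, target label) triple per value/class pair."""
    return [
        (template.format(value=value, label=label), value, label)
        for value in proposal.values
        for label in class_names
    ]
```

Keeping the target label next to each caption matters: the retrieved images inherit that label, which is what lets accuracy be computed later without any human annotation.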

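Technical framework: C2B consists of four main modules. 1) Bias proposal generation: an LLM takes the textual description of the classification task and generates a list of candidate biases. 2) Caption generation: for each bias proposal, the LLM generates image captions that depict the bias. 3) Image retrieval: the captions serve as queries to retrieve matching images from a large-scale image collection. 4) Bias evaluation: the target classifier is run on the retrieved images, and its performance indicates whether it exhibits the bias.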
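The four modules above can be wired together roughly as follows. This is a minimal sketch, not the authors' implementation: the LLM, the retriever, and the classifier are passed in as plain callables, and the `toy_*` functions are invented stand-ins so the example runs without any model.

```python
from collections import defaultdict
from typing import Callable

# (caption, bias value, target label), as produced by the LLM stages
Caption = tuple[str, str, str]


def run_pipeline(
    task_description: str,
    propose: Callable[[str], dict[str, list[Caption]]],  # modules 1 and 2
    retrieve: Callable[[str], list[str]],                # module 3
    classify: Callable[[str], str],                      # target model
) -> dict[str, dict[str, float]]:
    """Module 4: accuracy per bias value for every proposed bias."""
    report = {}
    for bias_name, captions in propose(task_description).items():
        hits = defaultdict(list)
        for caption, bias_value, target_label in captions:
            for image in retrieve(caption):
                # The caption's target label acts as the pseudo ground truth.
                hits[bias_value].append(classify(image) == target_label)
        report[bias_name] = {v: sum(h) / len(h) for v, h in hits.items() if h}
    return report


def toy_propose(task: str) -> dict[str, list[Caption]]:
    # Stand-in for the LLM: one bias ("age") with two values.
    return {"age": [
        ("young person, blond hair", "young", "blond"),
        ("young person, dark hair", "young", "dark"),
        ("old person, blond hair", "old", "blond"),
        ("old person, dark hair", "old", "dark"),
    ]}


def toy_retrieve(caption: str) -> list[str]:
    # Stand-in for retrieval: pretend each caption returns two image ids.
    return [f"{caption}#{i}" for i in range(2)]


def toy_classify(image: str) -> str:
    # Deliberately biased model: predicts "dark" for every old person.
    if image.startswith("old"):
        return "dark"
    return "blond" if "blond" in image else "dark"


report = run_pipeline(
    "hair color classification", toy_propose, toy_retrieve, toy_classify
)
print(report)  # {'age': {'young': 1.0, 'old': 0.5}}
```

Passing the three components as callables mirrors the claim that the framework applies to any pre-trained classifier: only `classify` changes when the audited model changes.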

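Key innovation: C2B removes the dependence on labeled data entirely, enabling unsupervised bias detection. Because an LLM generates the bias proposals and captions automatically, the framework can surface latent, previously unknown biases in a model. This contrasts with traditional methods, which rely on a predefined set of biases and on annotated data.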

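Key design: The main design questions are 1) which LLM to use and how to write prompts that yield high-quality bias proposals and captions; 2) how to build the retrieval module so that the retrieved images faithfully reflect each bias; and 3) how to define an evaluation metric that accurately measures the model's performance on the bias-specific images. This summary does not state which LLM, retrieval model, or evaluation metric the paper uses.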
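For the second and third design questions, one plausible choice is sketched below: rank gallery images by cosine similarity to a caption embedding, then score each bias by the gap between its best and worst per-value accuracy. Both choices, the 0.1 threshold, and all function names are assumptions made for illustration; they are not taken from the paper, and the embeddings are assumed to come from some text-image encoder that is not shown.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query: list[float], gallery: dict[str, list[float]], k: int) -> list[str]:
    """Ids of the k gallery embeddings most similar to the caption embedding."""
    ranked = sorted(gallery, key=lambda i: cosine(query, gallery[i]), reverse=True)
    return ranked[:k]


def accuracy_gap(per_value_accuracy: dict[str, float]) -> float:
    """Best-minus-worst accuracy across the values of one bias."""
    scores = list(per_value_accuracy.values())
    return max(scores) - min(scores)


def rank_biases(
    report: dict[str, dict[str, float]], threshold: float = 0.1
) -> list[tuple[str, float]]:
    """Biases whose accuracy gap reaches the threshold, largest gap first."""
    gaps = {
        name: accuracy_gap(acc)
        for name, acc in report.items()
        if len(acc) >= 2  # a gap needs at least two groups
    }
    flagged = [(name, gap) for name, gap in gaps.items() if gap >= threshold]
    return sorted(flagged, key=lambda item: item[1], reverse=True)
```

With per-value accuracies such as `{"age": {"young": 1.0, "old": 0.5}}`, `rank_biases` would flag `age` with a gap of 0.5, while a bias whose groups differ by only a few points stays below the threshold.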

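📊 Experimental Highlights

C2B was evaluated on two publicly available datasets. The results show that it discovers biases beyond those of the original datasets and outperforms a recent state-of-the-art bias detection baseline that relies on task-specific annotations, supporting its effectiveness for unsupervised bias detection. The size of the improvement is not reported in this summary.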

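🎯 Application Scenarios

C2B can be used to detect bias in a wide range of visual classifiers, such as face recognition and general image classification models. It can help developers find and mitigate potential biases before deployment, improving fairness and reliability. It can also be used to gauge the degree of bias in existing models and to guide their improvement.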

📄 Abstract (Original)

A person downloading a pre-trained model from the web should be aware of its biases. Existing approaches for bias identification rely on datasets containing labels for the task of interest, something that a non-expert may not have access to, or may not have the necessary resources to collect: this greatly limits the number of tasks where model biases can be identified. In this work, we present Classifier-to-Bias (C2B), the first bias discovery framework that works without access to any labeled data: it only relies on a textual description of the classification task to identify biases in the target classification model. This description is fed to a large language model to generate bias proposals and corresponding captions depicting biases together with task-specific target labels. A retrieval model collects images for those captions, which are then used to assess the accuracy of the model w.r.t. the given biases. C2B is training-free, does not require any annotations, has no constraints on the list of biases, and can be applied to any pre-trained model on any classification task. Experiments on two publicly available datasets show that C2B discovers biases beyond those of the original datasets and outperforms a recent state-of-the-art bias detection baseline that relies on task-specific annotations, being a promising first step toward addressing task-agnostic unsupervised bias detection.
