AI Integrity: A New Paradigm for Verifiable AI Governance

Author: Seulki Lee

Category: cs.AI

Published: 2026-04-13

Comments: 13 pages, 8 tables

DOI: 10.5281/zenodo.18861026

💡 One-sentence summary

Proposes AI Integrity to address the lack of verifiability in AI governance.

🎯 Matched area: Pillar 1: Robot Control

Keywords: AI Integrity, verifiability, reasoning process, Authority Stack, PRISM framework, decision support, transparency

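📋 Key points

Existing AI governance approaches focus on evaluating outcomes and do not verify the reasoning process, which can leave decisions opaque and hard to trust.
The paper proposes the concept of AI Integrity, which centers on protecting an AI system's Authority Stack so that its reasoning process stays transparent and auditable.
Through the PRISM framework, the paper defines six core metrics intended to make AI Integrity measurable in practice.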

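📝 Abstract (translated from Chinese)

As AI systems play an increasingly important role in high-stakes decisions in healthcare, law, defense, and education, existing governance paradigms (AI Ethics, AI Safety, and AI Alignment) share a common limitation: they evaluate outcomes rather than verify the reasoning process. This paper proposes AI Integrity, defined as a state in which an AI system's Authority Stack is protected from corruption, contamination, manipulation, and bias and is maintained in a verifiable manner. The paper distinguishes AI Integrity from the three existing paradigms, defines the Authority Stack as a 4-layer cascade model, and proposes the PRISM framework as the operational methodology, defining six core metrics and a phased research roadmap.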

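🔬 Method details

Problem definition: The paper addresses the gap in existing AI governance approaches around verifying the reasoning process. Those approaches evaluate outcomes and overlook whether the reasoning behind them is transparent and auditable.

Core idea: AI Integrity is proposed as a procedural concept. It requires that the path from evidence to conclusion be transparent and auditable, regardless of which values the system holds.

Technical framework: The overall architecture is a 4-layer cascade model (Normative, Epistemic, Source, and Data Authority). The layers are grounded in established academic frameworks: Schwartz Basic Human Values for normative authority, Walton argumentation schemes with GRADE/CEBM hierarchies for epistemic authority, and Source Credibility Theory for source authority.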
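To make the layered structure concrete, the following is a minimal Python sketch, not the paper's method. It assumes a hypothetical record format in which every decision below the Normative layer cites one decision from the layer directly above it, and it flags any decision that does not. The names `Layer`, `Decision`, `justified_by`, and `audit_cascade` are illustrative inventions; the paper's actual criteria for distinguishing legitimate cascading from Authority Pollution are not given in this summary.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class Layer(IntEnum):
    # Order follows the four layers named in the abstract, top to bottom.
    NORMATIVE = 0
    EPISTEMIC = 1
    SOURCE = 2
    DATA = 3


@dataclass(frozen=True)
class Decision:
    """One recorded choice at a layer (hypothetical record format)."""
    id: str
    layer: Layer
    statement: str
    justified_by: Optional[str] = None  # id of a decision in the layer above


def audit_cascade(decisions: list[Decision]) -> list[str]:
    """Return ids of decisions not justified by the layer directly above.

    Assumption for this sketch only: a well-formed trace links each
    non-top-layer decision to exactly one decision one layer up.
    """
    by_id = {d.id: d for d in decisions}
    flagged = []
    for d in decisions:
        if d.layer == Layer.NORMATIVE:
            continue  # the top layer has no parent in this sketch
        parent = by_id.get(d.justified_by) if d.justified_by else None
        if parent is None or parent.layer != d.layer - 1:
            flagged.append(d.id)
    return flagged


# Example trace with made-up statements; "d1" skips two layers.
trace = [
    Decision("n1", Layer.NORMATIVE, "prioritize patient safety"),
    Decision("e1", Layer.EPISTEMIC, "prefer randomized trials", justified_by="n1"),
    Decision("s1", Layer.SOURCE, "use peer-reviewed journals", justified_by="e1"),
    Decision("d1", Layer.DATA, "include a registry dataset", justified_by="n1"),
]
print(audit_cascade(trace))  # ['d1']
```

The point of the sketch is only that an explicit link between layers gives an auditor something to check. A real audit would also need to judge whether each cited justification is substantively valid, which this structural check does not attempt.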

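Key innovation: AI Integrity differs from the AI Ethics, AI Safety, and AI Alignment paradigms in that it targets the verifiability of the reasoning process, not only the correctness of outcomes.

Key design: The PRISM framework defines six core metrics. Their parameter settings and measurement procedures are not described in detail here and are left for future research to specify.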

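📊 Experimental highlights

The PRISM framework is presented as an operational way to measure AI Integrity, with six core metrics designed to assess the transparency of an AI system's reasoning process. No performance data or comparison baselines are reported, and validating the metrics is left to future research.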

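🎯 Application scenarios

Potential application areas include clinical decision support systems, legal judgment assistance tools, and educational assessment systems. Ensuring that an AI system's reasoning process is transparent and auditable could raise the credibility and social acceptance of its decisions, which in turn could support broader adoption of AI.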

📄 Abstract (original)

AI systems increasingly shape high-stakes decisions in healthcare, law, defense, and education, yet existing governance paradigms -- AI Ethics, AI Safety, and AI Alignment -- share a common limitation: they evaluate outcomes rather than verifying the reasoning process itself. This paper introduces AI Integrity, a concept defined as a state in which the Authority Stack of an AI system -- its layered hierarchy of values, epistemological standards, source preferences, and data selection criteria -- is protected from corruption, contamination, manipulation, and bias, and maintained in a verifiable manner. We distinguish AI Integrity from the three existing paradigms, define the Authority Stack as a 4-layer cascade model (Normative, Epistemic, Source, and Data Authority) grounded in established academic frameworks -- Schwartz Basic Human Values for normative authority, Walton argumentation schemes with GRADE/CEBM hierarchies for epistemic authority, and Source Credibility Theory for source authority -- characterize the distinction between legitimate cascading and Authority Pollution, and identify Integrity Hallucination as the central measurable threat to value consistency. We further specify the PRISM (Profile-based Reasoning Integrity Stack Measurement) framework as the operational methodology, defining six core metrics and a phased research roadmap. Unlike normative frameworks that prescribe which values are correct, AI Integrity is a procedural concept: it requires that the path from evidence to conclusion be transparent and auditable, regardless of which values a system holds.
