Steganography Without Modification: Hidden Communication via LLM Seeds

Authors: Felix Mächtle, Jonas Sander, Sebastian Berndt, Ben Weimar, Nils Loose, Thomas Eisenbarth

Categories: cs.CR, cs.AI

Published: 2026-06-08

Comments: To appear in the Proceedings of the International Conference on Availability, Reliability and Security (ARES 2026)

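💡 One-Sentence Summary

Proposes a modification-free steganographic communication method that hides messages in LLM PRNG seeds.

🎯 Matched Area: Pillar 9: Embodied Foundation Models

Keywords: steganographic communication, large language models, pseudo-random number generators, information hiding, secure communication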

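📋 Key Points

Existing steganographic communication methods typically depend on modifying the model, which limits their applicability and flexibility.
The paper proposes steganographic communication through LLM seeds that requires no modification of the model, using the structural properties of the PRNG to encode and decode information.
Experiments show that with a known prompt, 32-bit seed recovery reaches up to 100% accuracy; with an unknown prompt, recovery accuracy is near-perfect at 600-800 tokens.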

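📝 Abstract (Translated)

This work shows that widely deployed large language model (LLM) inference stacks contain a steganographic channel that requires no modification to model weights, sampling code, or output distributions. The channel exploits a structural property of deterministic decoding: the pseudo-random number generator (PRNG) used for inverse-transform sampling produces a seed-dependent sequence of token-level probability intervals that can be reconstructed from the generated text alone. The sender encodes a secret message in the PRNG seed before generation; the receiver reconstructs the intervals and recovers the seed, and with it the hidden payload, by exhaustive search over the seed space. The authors formalize two operational modes: the known-prompt setting and the unknown-prompt setting. Experiments show that in the known-prompt setting, recovery of the full 32-bit seed reaches up to 100% accuracy. In the unknown-prompt setting, recovery accuracy is near-perfect at 600-800 tokens.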

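🔬 Method Details

Problem definition: Existing steganographic communication methods depend on modifying the model, which limits their flexibility and breadth of application. This work aims to remove that dependency.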

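Core idea: The PRNG used for sampling in an LLM produces, for a given seed, a sequence of draws that each fall inside the probability interval of the token that was emitted. The sender hides information in the seed before generating text, and the receiver recovers it by reconstructing these intervals from the text.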
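
To make the notion of a token-level probability interval concrete, here is a minimal sketch of inverse-transform sampling. The three-token distribution, the seed value, and the helper names are illustrative assumptions, and Python's `random.Random` stands in for whatever PRNG a real inference stack uses.

```python
import random
from itertools import accumulate

def token_intervals(probs):
    """Map each token index to its half-open interval [lo, hi) on the unit line."""
    highs = list(accumulate(probs))
    lows = [0.0] + highs[:-1]
    return list(zip(lows, highs))

def sample_token(probs, u):
    """Inverse-transform sampling: return the token whose interval contains u."""
    for token, (lo, hi) in enumerate(token_intervals(probs)):
        if lo <= u < hi:
            return token
    return len(probs) - 1  # guard against floating-point round-off at the top end

probs = [0.5, 0.3, 0.2]      # hypothetical next-token distribution
rng = random.Random(1234)    # the seed is where a sender would hide the payload
u = rng.random()             # uniform draw in [0, 1)
token = sample_token(probs, u)
lo, hi = token_intervals(probs)[token]
print(f"draw {u:.4f} selected token {token} with interval [{lo:.1f}, {hi:.1f})")
```

Observing the emitted token tells the receiver that the draw landed somewhere in that token's interval, which is the constraint a candidate seed must satisfy at every position.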

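Technical framework: The method has two main stages. First, the sender encodes the message into the PRNG seed before generating text. Second, the receiver analyzes the generated text, reconstructs the probability intervals, and recovers the seed by exhaustive search over the seed space.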
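
The two stages can be sketched end to end on a toy scale. Everything here is an assumption made for illustration: `toy_model` is a hypothetical deterministic stand-in for an LLM, the seed space is shrunk to 12 bits so the search runs in pure Python, and `random.Random` replaces the PRNG of a real inference stack.

```python
import random
from itertools import accumulate

VOCAB_SIZE = 8
SEED_BITS = 12  # toy seed space; the paper searches the full 2**32 space

def toy_model(context):
    """Hypothetical stand-in for an LLM: deterministic next-token probabilities."""
    shift = 5 * len(context) + sum(context)
    weights = [1 + (3 * tok + shift) % 7 for tok in range(VOCAB_SIZE)]
    total = sum(weights)
    return [w / total for w in weights]

def upper_bounds(probs):
    """Cumulative upper edges of the token intervals, with the last pinned to 1.0."""
    highs = list(accumulate(probs))
    highs[-1] = 1.0
    return highs

def pick(probs, u):
    """Inverse-transform sampling: first token whose upper edge exceeds u."""
    for tok, hi in enumerate(upper_bounds(probs)):
        if u < hi:
            return tok
    raise ValueError("u must lie in [0, 1)")

def send(payload, prompt, n_tokens):
    """Stage 1: use the payload as the PRNG seed, then sample as usual."""
    rng = random.Random(payload)
    context, text = list(prompt), []
    for _ in range(n_tokens):
        tok = pick(toy_model(context), rng.random())
        text.append(tok)
        context.append(tok)
    return text

def consistent(seed, intervals):
    """True if every draw of this seed lands in the observed token's interval."""
    rng = random.Random(seed)
    return all(lo <= rng.random() < hi for lo, hi in intervals)

def receive(prompt, text):
    """Stage 2: rebuild the intervals from prompt and text, then search all seeds."""
    context, intervals = list(prompt), []
    for tok in text:
        highs = upper_bounds(toy_model(context))
        intervals.append((highs[tok - 1] if tok > 0 else 0.0, highs[tok]))
        context.append(tok)
    return [s for s in range(2 ** SEED_BITS) if consistent(s, intervals)]

prompt = [1, 2, 3]
text = send(0xABC, prompt, n_tokens=40)
print([hex(s) for s in receive(prompt, text)])
```

The true seed always passes the consistency check because the receiver replays exactly the sender's computation. A wrong seed survives only if all of its draws happen to land in the observed intervals, which becomes less likely with every additional token.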

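Key innovation: The work proposes a steganographic communication method that requires no modification of the model. It exploits a structural property of LLM sampling to transmit information covertly, removing a limitation of conventional steganographic approaches.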

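Key design: In the known-prompt setting, forced alignment yields exact interval reconstruction. In the unknown-prompt setting, approximate interval reconstruction is combined with a maximum-hit-count scoring strategy so that the seed can still be recovered reliably. The experiments also analyze how prompting strategies, tokenization ambiguities, and sampling hyperparameters affect channel reliability.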
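
When the intervals are only approximate, demanding that every draw lands inside its interval would reject the true seed, so a count-based score is the natural alternative. The sketch below illustrates maximum-hit-count scoring under loudly stated assumptions: the approximate intervals are simulated by jittering a window around each true draw (a real receiver would obtain them by running the model on the text without the prompt), the seed space is 12 bits, and all function names and parameters are hypothetical.

```python
import random

SEED_BITS = 12  # toy seed space for illustration

def hit_count(seed, intervals):
    """Number of positions where this seed's draw falls inside the interval."""
    rng = random.Random(seed)
    return sum(lo <= rng.random() < hi for lo, hi in intervals)

def recover_seed(intervals):
    """Return the candidate seed with the highest hit count."""
    return max(range(2 ** SEED_BITS), key=lambda seed: hit_count(seed, intervals))

def simulate_approx_intervals(true_seed, n_tokens, width=0.2, jitter=0.15):
    """Simulated imperfect reconstruction: windows that sometimes miss the true draw."""
    draws = random.Random(true_seed)
    noise = random.Random(987654321)  # arbitrary noise source outside the seed space
    intervals = []
    for _ in range(n_tokens):
        centre = draws.random() + noise.uniform(-jitter, jitter)
        intervals.append((max(0.0, centre - width / 2), min(1.0, centre + width / 2)))
    return intervals

true_seed = 1234
approx = simulate_approx_intervals(true_seed, n_tokens=200)
best = recover_seed(approx)
print(best, hit_count(best, approx), "hits out of", len(approx))
```

In this simulation the true seed misses roughly a third of the windows, while a wrong seed hits each window only with probability close to its width, so the gap in hit counts widens as the text gets longer. This is consistent with the reported need for longer outputs in the unknown-prompt setting, though the noise model here is not taken from the paper.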

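📊 Experimental Highlights

Across six model families and five heterogeneous text domains, full 32-bit seed recovery in the known-prompt setting reaches up to 100% accuracy, depending on model and text domain, within 300 tokens and under 35 seconds on a single GPU. In the unknown-prompt setting, recovery accuracy is near-perfect at 600-800 tokens in about 12 seconds.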
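
As a back-of-the-envelope check, the original abstract's figures for the known-prompt setting (the complete 2^32 candidate space searched in under 35 seconds) imply a lower bound on search throughput. The snippet below only does that arithmetic; the variable names are illustrative.

```python
SEED_SPACE = 2 ** 32           # number of candidate 32-bit seeds
KNOWN_PROMPT_SECONDS = 35      # reported upper bound on search time, single GPU
PAYLOAD_BYTES = 32 // 8        # one seed carries a 32-bit payload

# "Under 35 seconds" is an upper bound on time, so this is a lower bound on rate.
min_seeds_per_second = SEED_SPACE / KNOWN_PROMPT_SECONDS

print(f"candidates: {SEED_SPACE:,}")
print(f"payload per generated text: {PAYLOAD_BYTES} bytes")
print(f"implied throughput: at least {min_seeds_per_second:.3g} seeds/s")
```

That works out to more than 10^8 candidate seeds checked per second, for a payload of 4 bytes per generated text.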

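🎯 Application Scenarios

Potential application areas include secure communication, data privacy protection, and information hiding. Because the method needs no modification, it can transmit a 32-bit payload covertly through an existing, unchanged model. The original abstract also draws a defensive conclusion: ignorance of the prompt is not a valid security assumption.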

📄 Abstract (Original)

We demonstrate that widely deployed Large Language Model (LLM) inference stacks harbor a steganographic channel that requires no modification to model weights, sampling code, or output distributions. The channel exploits a structural property of deterministic decoding: pseudo-random number generators (PRNGs) used in inverse-transform sampling produce a seed-dependent sequence of token-level probability intervals that can be reconstructed from the generated text alone. A sender encodes a secret message in the PRNG seed before generation; a receiver reconstructs the intervals and recovers the seed, and thus the hidden payload, by exhaustive search over the seed space. We formalize two operational modes. In the known-prompt setting, sender and receiver share the prompt, enabling exact interval reconstruction and perfect seed recovery via forced alignment. In the unknown-prompt setting, only the generated text is available; approximate interval reconstruction combined with a maximum-hit-count scoring strategy still permits reliable recovery from sufficiently long outputs. Extensive experiments across six model families and five heterogeneous text domains show that, in the known-prompt setting, full 32-bit seed recovery from the complete 2^32 candidate space achieves up to 100% accuracy, depending on model and text domain, within 300 tokens and under 35 seconds on a single GPU. In the unknown-prompt setting, recovery reaches near-perfect accuracy at 600-800 tokens in about 12 seconds. We further analyze the influence of prompting strategies, tokenization ambiguities, and sampling hyperparameters on channel reliability. Moreover, we discuss several applications of our results: First, it allows for the steganographic transmission of 32 bits, but also shows that ignorance of the prompt is not a valid security assumption.
