Empirical Computation

作者: Eric Tang, Marcel Böhme

分类: cs.SE, cs.AI

发布日期: 2025-03-13

备注: Open challenges in the analysis of properties and limits of empirical computation

💡 一句话要点

探索经验计算：一种基于经验而非形式化方法的计算范式

🎯 匹配领域: 支柱九：具身大模型 (Embodied Foundation Models)

关键词: 经验计算 大型语言模型 计算范式 软件工程 人工智能

📋 核心要点

传统计算依赖形式化方法，需要预先定义算法和数据结构，限制了其灵活性和适用性。
经验计算利用大型语言模型直接解决问题，无需特定算法和数据格式，更具通用性。
经验计算的能力和局限性无法用传统计算理论解释，需要新的理论框架和评估方法。

📝 摘要（中文）

本文探讨了一种计算形式的挑战和机遇，该计算形式采用经验（而非形式化）方法，其中计算问题的解决方案以经验上最有可能（而非必然正确）的方式返回。我们将这种方法称为经验计算，并观察到其能力和局限性无法在经典的理性主义计算框架内理解。虽然我们对“计算问题”采取了非常广泛的看法，但一个经典的、经过充分研究的例子是排序：给定一组$n$个数字，按升序返回这些数字。与传统的形式化计算不同，经验计算可以直接使用大型语言模型（LLM）来解决任何计算问题，输入格式也可以是任意的。本文旨在将经验计算确立为软件工程领域中一个及时且充满有趣问题的领域。

🔬 方法详解

问题定义：论文旨在探索一种新的计算范式，即“经验计算”。传统计算方法依赖于预先定义好的算法和数据结构，例如排序问题需要先确定排序算法（如归并排序），再编写特定程序，并对输入数据类型有严格要求。这种方式在处理复杂、非结构化问题时存在局限性。现有方法的痛点在于缺乏灵活性和通用性，难以应对自然语言描述的问题和各种数据格式。

核心思路：论文的核心思路是利用大型语言模型（LLM）的强大能力，直接将计算问题交给LLM解决，而无需预先设计算法和程序。LLM可以理解自然语言描述的问题，并处理各种格式的输入数据。这种方式将计算过程视为一个黑盒，关注的是经验上的正确性，而非形式上的证明。

技术框架：论文并没有提出一个具体的系统架构，而是在概念层面探讨了经验计算的特性。其核心在于利用现有的LLM作为计算引擎，将问题描述和输入数据直接输入LLM，然后获取LLM的输出作为计算结果。整个流程简化了传统计算的复杂性，降低了对算法和数据结构的依赖。

关键创新：论文最重要的技术创新点在于提出了“经验计算”这一新的计算范式。与传统计算方法相比，经验计算更加灵活、通用，可以处理非结构化问题和各种数据格式。此外，经验计算的时间复杂度与问题本身的计算复杂度无关，而是取决于LLM的推理能力。

关键设计：论文没有涉及具体的参数设置、损失函数或网络结构等技术细节，因为其重点在于概念层面的探讨。关键在于如何有效地利用LLM解决各种计算问题，以及如何评估经验计算的正确性和可靠性。未来的研究可以探索如何优化LLM的prompt，以及如何设计新的评估指标来衡量经验计算的性能。

🖼️ 关键图片

📊 实验亮点

由于是愿景论文，没有提供具体的实验结果。论文的核心价值在于提出了“经验计算”这一新的概念，并阐述了其与传统计算方法的区别。论文指出，经验计算的能力和局限性无法用传统计算理论解释，需要新的理论框架和评估方法。未来的研究可以探索如何评估经验计算的性能，以及如何优化LLM在经验计算中的应用。

🎯 应用场景

经验计算具有广泛的应用前景，例如可以用于自然语言处理、图像识别、机器人控制等领域。它可以简化复杂问题的求解过程，降低开发成本，并提高计算的灵活性和通用性。未来，经验计算有望成为一种重要的计算范式，推动人工智能和软件工程的发展。

📄 摘要（原文）

In this vision paper, we explore the challenges and opportunities of a form of computation that employs an empirical (rather than a formal) approach, where the solution of a computational problem is returned as empirically most likely (rather than necessarily correct). We call this approach as empirical computation and observe that its capabilities and limits cannot be understood within the classic, rationalist framework of computation. While we take a very broad view of "computational problem", a classic, well-studied example is sorting: Given a set of $n$ numbers, return these numbers sorted in ascending order. * To run a classical, formal computation, we might first think about a specific algorithm (e.g., merge sort) before developing a specific program that implements it. The program will expect the input to be given in a specific format, type, or data structure (e.g., unsigned 32-bit integers). In software engineering, we have many approaches to analyze the correctness of such programs. From complexity theory, we know that there exists no correct program that can solve the average instance of the sorting problem faster than $O(n\log n)$. * To run an empirical computation, we might directly ask a large language model (LLM) to solve any computational problem (which can be stated informally in natural language) and provide the input in any format (e.g., negative numbers written as Chinese characters). There is no (problem-specific) program that could be analyzed for correctness. Also, the time it takes an LLM to return an answer is entirely independent of the computational complexity of the problem that is solved. What are the capabilities or limits of empirical computation in the general, in the problem-, or in the instance-specific? Our purpose is to establish empirical computation as a field in SE that is timely and rich with interesting problems.

Empirical Computation

💡 一句话要点

📋 核心要点

📝 摘要（中文）

🔬 方法详解

🖼️ 关键图片

📊 实验亮点

🎯 应用场景

📄 摘要（原文）

⭐ 我的收藏

📁 新建收藏夹

⚙️ 管理收藏夹

🔍 搜索论文

🔐 登录 / 注册

👤 用户管理