BoostLLM: Boosting-inspired LLM Fine-tuning for Few-shot Tabular Classification

Authors: Yi-Siang Wang, Kuan-Yu Chen, Yu-Chen Den, Darby Tien-Hao Chang

Category: cs.LG

Published: 2026-05-07

Comments: 19 pages, 4 figures

💡 One-Sentence Summary

Proposes BoostLLM, which fine-tunes LLMs in a boosting-inspired way to improve few-shot tabular classification.

🎯 Matched area: Pillar 9: Embodied Foundation Models

Keywords: LLM fine-tuning, boosting, tabular classification, few-shot learning, parameter-efficient fine-tuning

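📋 Key Points

In few-shot tabular classification, existing LLMs underperform gradient-boosted decision trees (GBDTs).
BoostLLM brings boosting into LLM fine-tuning: it trains a sequence of PEFT adapters that perform multi-round residual optimization.
Experiments show that, across multiple datasets and LLM backbones, BoostLLM improves on standard fine-tuning, matches or surpasses XGBoost, and outperforms GPT-4o-based methods.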

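📝 Abstract (translated from Chinese)

Large language models (LLMs) have recently been applied to tabular prediction by serializing structured features into natural language, but in low-data regimes their performance still lags behind gradient-boosted decision trees (GBDTs). This paper revisits the boosting paradigm, traditionally associated with tree ensembles, and asks whether it can serve as a general training principle for LLM fine-tuning. The authors propose BoostLLM, a framework that turns parameter-efficient fine-tuning into a multi-round residual optimization process by training sequential PEFT adapters as weak learners. To incorporate tabular inductive bias, BoostLLM adds decision-tree paths as a second input view alongside the raw features; analysis shows that the path view acts as a structured teacher in early training steps, before the model shifts toward feature-driven representations. Across multiple LLM backbones and datasets, BoostLLM improves consistently over standard fine-tuning, matches or surpasses XGBoost across a wide range of shot counts, and outperforms GPT-4o-based methods with a 4B model. The framework also scales: pairing it with stronger tree models and longer boosting horizons yields further gains under appropriate stabilization. These results suggest that boosting can serve as a general training principle for LLM fine-tuning, particularly in low-data regimes for structured data.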

🔬 Method Details

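Problem definition: The paper addresses the weak performance of LLMs on few-shot tabular classification. Existing approaches serialize table rows into text and fine-tune the LLM on them directly, but they make little use of the table's structure, so they generalize poorly when data is scarce.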
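As a concrete picture of the baseline described here, the following is a minimal sketch of serializing one table row into a natural-language prompt. The template wording, the function name `serialize_row`, and the example columns are all assumptions made for illustration; the summary does not state which template the paper uses.

```python
def serialize_row(row, target_name):
    """Turn one table row (a dict of column -> value) into a text prompt."""
    # One short sentence per feature, in the row's column order.
    parts = [f"The {name} is {value}." for name, value in row.items()]
    return " ".join(parts) + f" What is the {target_name}?"


# Hypothetical example row; the column names are invented for illustration.
row = {"age": 42, "occupation": "teacher", "hours per week": 38}
prompt = serialize_row(row, "income bracket")
print(prompt)
```

A prompt like this carries the feature values but nothing about how the features interact, which is the gap the summary says BoostLLM tries to close.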

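Core idea: The paper brings boosting into LLM fine-tuning. Boosting trains a sequence of weak learners iteratively and combines them into one strong learner. BoostLLM follows the same pattern: it splits fine-tuning into multiple rounds, trains one parameter-efficient fine-tuning (PEFT) adapter per round, and has each round optimize the residual left by the previous rounds.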
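For reference, the textbook gradient-boosting update for binary classification with logistic loss can be written as below. This is the standard formulation, not necessarily the exact objective used in the paper. The symbols $F_m$ (accumulated score after round $m$), $h_m$ (the round-$m$ weak learner, which in BoostLLM would presumably correspond to one PEFT adapter), and $\eta$ (a shrinkage or learning rate) are notation introduced here.

```latex
\begin{aligned}
r_i^{(m)} &= y_i - \sigma\big(F_{m-1}(x_i)\big),
  \qquad \sigma(z) = \frac{1}{1 + e^{-z}} \\
h_m &\approx \arg\min_{h} \sum_i \big(r_i^{(m)} - h(x_i)\big)^2 \\
F_m(x) &= F_{m-1}(x) + \eta\, h_m(x)
\end{aligned}
```

Here $r_i^{(m)}$ is the negative gradient of the log loss with respect to the current score, so each round fits whatever error the earlier rounds left behind.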

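Framework: BoostLLM consists of four main modules. 1) Feature engineering: preprocesses the tabular data into two views, the raw features and the decision-tree path features. 2) LLM backbone: a pretrained LLM serves as the base model. 3) PEFT adapter: in each boosting round, one parameter-efficient adapter is trained to learn the residual. 4) Boosting iteration: runs multiple boosting rounds, training one PEFT adapter per round and accumulating its output into the final prediction.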
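The boosting iteration in module 4 can be illustrated with a toy loop in plain Python. This is only a sketch of the generic residual-fitting pattern: a one-split regression stump stands in for "train one PEFT adapter", and the data, `rounds`, and `lr` values are invented. It is not the paper's training procedure.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_stump(xs, residuals):
    """Fit a one-split regression stump to the residuals (the weak learner)."""
    best = None
    for t in sorted(set(xs)):
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        if not left or not right:
            continue  # a split must put points on both sides
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, t, lm, rm)
    _, t, lm, rm = best
    return lambda x: lm if x <= t else rm


def log_loss(ys, scores):
    return -sum(y * math.log(sigmoid(s)) + (1 - y) * math.log(1 - sigmoid(s))
                for y, s in zip(ys, scores)) / len(ys)


def boost(xs, ys, rounds=5, lr=0.5):
    scores = [0.0] * len(xs)  # start from a neutral score for every example
    learners, losses = [], [log_loss(ys, scores)]
    for _ in range(rounds):
        # Residual = label minus current predicted probability.
        residuals = [y - sigmoid(s) for y, s in zip(ys, scores)]
        h = fit_stump(xs, residuals)  # stand-in for training one adapter
        # Accumulate this round's correction into the running score.
        scores = [s + lr * h(x) for s, x in zip(scores, xs)]
        learners.append(h)
        losses.append(log_loss(ys, scores))
    return learners, losses


# Toy one-feature dataset, invented for illustration.
xs = [0.1, 0.4, 0.5, 0.9, 1.2, 1.5, 1.8, 2.0]
ys = [0, 0, 1, 0, 1, 1, 1, 1]
learners, losses = boost(xs, ys)
print([round(l, 3) for l in losses])
```

The printed list is the training log loss before boosting and after each round; each round adds a small correction rather than refitting from scratch.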

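Key innovation: BoostLLM applies the boosting principle to LLM fine-tuning and adapts it to tabular data by using decision-tree paths as auxiliary information, which improves LLM performance on few-shot tabular classification. Using PEFT adapters also lowers compute cost, because the full LLM never has to be fine-tuned.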

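Key design: The main design choices are: 1) decision-tree paths as an auxiliary input that supplies structural information about the table; 2) PEFT adapters for fine-tuning, to reduce compute cost; 3) a loss function suited to optimizing the result of each boosting round; 4) hyperparameters such as the number of boosting rounds and the learning rate, chosen experimentally.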
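To make the first design choice concrete, here is a minimal sketch of turning a decision-tree path into text and pairing it with the raw features. The tree is hand-written, and its feature names, thresholds, and the prompt layout are invented for illustration. The summary does not say how the paper formats the path view or whether it includes the leaf prediction.

```python
# A tiny hand-written tree; in practice it would come from a fitted tree model.
# All feature names and thresholds here are hypothetical.
TREE = {
    "feature": "age", "threshold": 30,
    "left": {"leaf": "class 0"},
    "right": {
        "feature": "hours per week", "threshold": 40,
        "left": {"leaf": "class 0"},
        "right": {"leaf": "class 1"},
    },
}


def path_to_text(tree, row):
    """Walk the tree for one row and describe each split that was taken."""
    steps = []
    node = tree
    while "leaf" not in node:
        name, thr = node["feature"], node["threshold"]
        if row[name] <= thr:
            steps.append(f"{name} <= {thr}")
            node = node["left"]
        else:
            steps.append(f"{name} > {thr}")
            node = node["right"]
    return " AND ".join(steps) + f" -> tree suggests {node['leaf']}"


def two_view_prompt(row, tree):
    """Combine the raw-feature view and the tree-path view into one input."""
    features = ", ".join(f"{k} = {v}" for k, v in row.items())
    return f"Features: {features}\nTree path: {path_to_text(tree, row)}"


row = {"age": 42, "hours per week": 38}
print(two_view_prompt(row, TREE))
```

The second line of the prompt gives the model an explicit rule over the features, which is one plausible reading of the path view acting as a "structured teacher" early in training.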

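📊 Experimental Highlights

BoostLLM was evaluated on multiple tabular datasets. It outperforms standard fine-tuning and matches XGBoost, surpassing it in some settings. It also outperforms GPT-4o-based methods while using only a 4B model.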

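🎯 Application Scenarios

BoostLLM could be applied in areas such as financial risk control, medical diagnosis, and customer relationship management, where tabular classification problems often come with little data but demand accurate predictions. The method improves LLM performance in these settings and reduces reliance on large labeled datasets.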

📄 Abstract (Original)

Large language models (LLMs) have recently been adapted to tabular prediction by serializing structured features into natural language, but their performance in low-data regimes remains limited compared to gradient-boosted decision trees (GBDTs). In this work, we revisit the boosting paradigm, traditionally associated with tree ensembles, and ask whether it can be applied as a general training principle for LLM fine-tuning. We propose BoostLLM, a framework that transforms parameter-efficient fine-tuning into a multi-round residual optimization process by training sequential PEFT adapters as weak learners. To incorporate tabular inductive bias, BoostLLM integrates decision-tree paths as a second input view alongside raw features; analysis reveals that the path view acts as a structured teacher in early training steps before the model shifts toward feature-driven representations. Empirically, BoostLLM achieves consistent improvements over standard fine-tuning across multiple LLM backbones and datasets, matching or surpassing XGBoost across a wide range of shot counts and outperforming GPT-4o-based methods with a 4B model. We further show that the framework scales: pairing with stronger tree models and extended boosting horizons yields additional gains under appropriate stabilization. These results suggest that boosting can serve as a general training principle for LLM fine-tuning, particularly in low-data regimes for structured data.
