| 1 |
Mothman at SemEval-2024 Task 9: An Iterative System for Chain-of-Thought Prompt Optimization |
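Proposes an iterative chain-of-thought prompt optimization system that improves large language model performance on lateral thinking tasks. |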
large language model chain-of-thought |
|
|
| 2 |
Exploring Combinatorial Problem Solving with Large Language Models: A Case Study on the Travelling Salesman Problem Using GPT-3.5 Turbo |
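Explores the use of large language models for combinatorial optimization problems, using GPT-3.5 Turbo to solve the travelling salesman problem as a case study. |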
large language model chain-of-thought |
|
|
| 3 |
Vibe-Eval: A hard evaluation suite for measuring progress of multimodal language models |
Vibe-Eval: a new, high-difficulty evaluation benchmark for multimodal language models. |
multimodal |
✅ |
|
| 4 |
PICLe: Eliciting Diverse Behaviors from Large Language Models with Persona In-Context Learning |
Proposes the PICLe framework, which uses Persona In-Context Learning to elicit specific persona behaviors from LLMs. |
large language model |
✅ |
|
| 5 |
Semantic Scaling: Bayesian Ideal Point Estimates with Large Language Models |
Proposes Semantic Scaling, which uses large language models to estimate ideal points from text, making ideology measurement more flexible. |
large language model |
|
|
| 6 |
Parameter-Efficient Instruction Tuning of Large Language Models For Extreme Financial Numeral Labelling |
Proposes FLAN-FinXC, which uses instruction tuning of LLMs with LoRA to address extreme financial numeral labelling in financial documents. |
large language model |
|
|
| 7 |
Optimising Calls to Large Language Models with Uncertainty-Based Two-Tier Selection |
Proposes an uncertainty-based two-tier selection method that optimizes the cost and performance of large language model calls. |
large language model |
|
|
| 8 |
Argumentative Large Language Models for Explainable and Contestable Claim Verification |
Proposes argumentative large language models (ArgLLMs) for explainable and contestable claim verification. |
large language model |
|
|
| 9 |
Large Multimodal Model based Standardisation of Pathology Reports with Confidence and their Prognostic Significance |
Proposes a large-model-based framework for standardising multimodal pathology reports and evaluates its prognostic significance. |
multimodal |
|
|
| 10 |
Analyzing Narrative Processing in Large Language Models (LLMs): Using GPT4 to test BERT |
Uses GPT4 to generate narrative texts and analyzes how BERT's representations differ across styles and content. |
large language model |
|
|
| 11 |
Dependency-Aware Semi-Structured Sparsity of GLU Variants in Large Language Models |
Proposes Dependency-aware Semi-structured Sparsity (DaSS), a method for compressing LLMs that use GLU variants. |
large language model |
|
|
| 12 |
Attribution in Scientific Literature: New Benchmark and Methods |
Proposes the REASONS dataset to address hallucination when LLMs automatically generate citations for scientific literature. |
large language model |
|
|
| 13 |
Assessing and Verifying Task Utility in LLM-Powered Applications |
AgentEval: a new framework for assessing and verifying task utility in LLM-powered applications. |
large language model |
|
|
| 14 |
What does the Knowledge Neuron Thesis Have to do with Knowledge? |
Questions the knowledge neuron thesis: knowledge storage in large language models does not depend solely on MLP weights. |
large language model |
|
|
| 15 |
Conformal Prediction for Natural Language Processing: A Survey |
A survey of conformal prediction techniques for natural language processing tasks and their applications. |
large language model |
|
|
| 16 |
DALLMi: Domain Adaption for LLM-based Multi-label Classifier |
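DALLMi: a domain adaptation method for LLM-based multi-label classifiers that addresses incomplete target-domain labels and high training overhead. |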
large language model |
|
|