| 1 |
Hypothesis Testing Prompting Improves Deductive Reasoning in Large Language Models |
Proposes hypothesis testing prompting to improve large language model performance on deductive reasoning tasks |
large language model chain-of-thought |
|
|
| 2 |
LLMs can Find Mathematical Reasoning Mistakes by Pedagogical Chain-of-Thought |
Proposes PedCoT, which uses a pedagogical chain of thought to improve LLMs' ability to identify mathematical reasoning mistakes |
large language model chain-of-thought |
|
|
| 3 |
Enhancing Creativity in Large Language Models through Associative Thinking Strategies |
Improves the creativity of large language models through associative thinking strategies |
large language model |
|
|
| 4 |
Muting Whisper: A Universal Acoustic Adversarial Attack on Speech Foundation Models |
Proposes a universal acoustic adversarial attack that makes the Whisper speech model ignore speech content |
foundation model |
|
|
| 5 |
Can Perplexity Reflect Large Language Model's Ability in Long Text Understanding? |
Questions the validity of PPL (perplexity) as an evaluation metric for long-text understanding and reveals its limitations |
large language model |
|
|
| 6 |
Digital Diagnostics: The Potential Of Large Language Models In Recognizing Symptoms Of Common Illnesses |
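Evaluates the potential of large language models to recognize symptoms of common illnesses, offering a new direction for digital diagnostics |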
large language model |
|
|
| 7 |
Towards a More Inclusive AI: Progress and Perspectives in Large Language Model Training for the Sámi Language |
Explores inclusive AI approaches to training large language models for the Sámi language in a low-resource setting |
large language model |
|
|
| 8 |
Can large language models understand uncommon meanings of common words? |
Builds the LeSC dataset, revealing the shortcomings of large language models in understanding uncommon meanings of common words |
large language model |
|
|
| 9 |
Exploring the Capabilities of Large Multimodal Models on Dense Text |
Proposes the DT-VQA dataset to explore the capabilities of large multimodal models on dense text understanding tasks |
multimodal |
✅ |
|
| 10 |
Can We Use Large Language Models to Fill Relevance Judgment Holes? |
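Uses large language models to fill missing relevance judgments, extending test collections to make retrieval system evaluation more reliable |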
large language model |
|
|
| 11 |
Boosting Large Language Models with Continual Learning for Aspect-based Sentiment Analysis |
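Proposes LLM-CL, a model based on large language models and continual learning, to address domain knowledge transfer and forgetting in aspect-based sentiment analysis |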
large language model |
|
|
| 12 |
Unveiling the Competitive Dynamics: A Comparative Evaluation of American and Chinese LLMs |
Compares American and Chinese large language models, revealing performance gaps across languages and tasks |
large language model multimodal |
|
|
| 13 |
Natural Language Processing RELIES on Linguistics |
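Emphasizes the enduring importance of linguistics in natural language processing in response to the challenges posed by large language models |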
large language model |
|
|
| 14 |
Towards a Path Dependent Account of Category Fluency |
Proposes a path-dependent model of category fluency to resolve the debate over the underlying cognitive mechanisms |
large language model |
|
|
| 15 |
OpenBA-V2: Reaching 77.3% High Compression Ratio with Fast Multi-Stage Pruning |
OpenBA-V2: reaches a high compression ratio of 77.3% through fast multi-stage pruning |
large language model |
|
|
| 16 |
Smurfs: Multi-Agent System using Context-Efficient DFSDT for Tool Planning |
Smurfs: a multi-agent tool planning system based on context-efficient DFSDT |
large language model |
|
|
| 17 |
Does Fine-Tuning LLMs on New Knowledge Encourage Hallucinations? |
Shows that introducing new knowledge during LLM fine-tuning increases hallucinations |
large language model |
|
|
| 18 |
Experimental Pragmatics with Machines: Testing LLM Predictions for the Inferences of Plain and Embedded Disjunctions |
Uses large language models to predict disjunction inferences, testing their potential for application in experimental pragmatics |
large language model |
|
|
| 19 |
Exploring the Human-LLM Synergy in Advancing Theory-driven Qualitative Analysis |
Proposes CHALET, which uses human-LLM collaboration to advance theory-driven qualitative analysis and uncover new insights |
large language model |
|
|
| 20 |
Chain of Attack: a Semantic-Driven Contextual Multi-Turn attacker for LLM |
Proposes CoA, a semantic-driven contextual multi-turn attack method for evaluating LLM safety |
large language model |
|
|
| 21 |
OpenFactCheck: Building, Benchmarking Customized Fact-Checking Systems and Evaluating the Factuality of Claims and LLMs |
OpenFactCheck: builds and benchmarks customized fact-checking systems and evaluates the factuality of claims and LLMs |
large language model |
✅ |
|