| 1 |
PCoT: Persuasion-Augmented Chain of Thought for Detecting Fake News and Social Media Disinformation |
Proposes PCoT, a persuasion-augmented chain-of-thought method for detecting fake news and disinformation on social media. |
large language model, chain-of-thought |
|
|
| 2 |
Quantile Regression with Large Language Models for Price Prediction |
Proposes an LLM-based quantile regression method to improve price prediction accuracy and uncertainty quantification |
large language model |
✅ |
|
| 3 |
SafeLawBench: Towards Safe Alignment of Large Language Models |
Proposes SafeLawBench, which evaluates the safety alignment of large language models from a legal perspective. |
large language model |
|
|
| 4 |
Can LLMs Generate Reliable Test Case Generators? A Study on Competition-Level Programming Problems |
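TCGBench: evaluates the ability of large language models to generate reliable test case generators for competition-level programming problems |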
large language model |
|
|
| 5 |
What Makes a Good Natural Language Prompt? |
Proposes a property- and human-centric framework for evaluating and optimizing natural language prompts. |
large language model |
|
|
| 6 |
Mixture of Small and Large Models for Chinese Spelling Check |
Proposes a method that mixes small and large models to improve Chinese spelling check performance |
large language model |
✅ |
|
| 7 |
Right Is Not Enough: The Pitfalls of Outcome Supervision in Training LLMs for Math Reasoning |
Proposes ParaStepVerifier for fine-grained, step-level verification of math-reasoning LLMs, addressing reward hacking |
large language model |
|
|
| 8 |
Adapt Once, Thrive with Updates: Transferable Parameter-Efficient Fine-Tuning on Evolving Base Models |
Trans-PEFT: a transferable parameter-efficient fine-tuning method that adapts to evolving base models |
large language model |
|
|
| 9 |
They want to pretend not to understand: The Limits of Current LLMs in Interpreting Implicit Content of Political Discourse |
Reveals the limitations of LLMs in understanding implicit meaning in political discourse, evaluated on the IMPAQTS corpus |
large language model |
✅ |
|
| 10 |
Dynamic and Parametric Retrieval-Augmented Generation |
Surveys dynamic and parametric retrieval-augmented generation (RAG) techniques for improving LLM knowledge integration |
large language model |
|
|
| 11 |
Psychological Counseling Cannot Be Achieved Overnight: Automated Psychological Counseling Through Multi-Session Conversations |
Proposes MusPsy-Dataset and MusPsy-Model for automated psychological counseling through multi-session conversations |
large language model |
|
|
| 12 |
BriefMe: A Legal NLP Benchmark for Assisting with Legal Briefs |
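Proposes BriefMe, a legal NLP benchmark for assisting with legal brief writing, comprising three tasks: summarization, completion, and case retrieval. |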
large language model |
|
|