| # | Title | Summary | Tags | |
|---|---|---|---|---|
| 1 | XL-HeadTags: Leveraging Multimodal Retrieval Augmentation for the Multilingual Generation of News Headlines and Tags | Proposes XL-HeadTags, which leverages multimodal retrieval augmentation for multilingual generation of news headlines and tags. | multimodal | |
| 2 | PaCE: Parsimonious Concept Engineering for Large Language Models | PaCE: parsimonious concept engineering for large language models, achieving model alignment through activation-space manipulation. | large language model | |
| 3 | Pointer-Guided Pre-Training: Infusing Large Language Models with Paragraph-Level Contextual Awareness | Proposes pointer-guided pre-training to strengthen large language models' understanding of paragraph-level context. | large language model | |
| 4 | Legal Documents Drafting with Fine-Tuned Pre-Trained Large Language Model | Drafts legal documents using a fine-tuned pre-trained large language model. | large language model | |
| 5 | Confabulation: The Surprising Value of Large Language Model Hallucinations | Reexamines LLM hallucinations, treating them as a potential resource for improving narrativity and coherence. | large language model | |
| 6 | Ask LLMs Directly, "What shapes your bias?": Measuring Social Bias in Large Language Models | Proposes directly asking LLMs about the sources of their bias to quantitatively evaluate social bias in large language models. | large language model | |
| 7 | Are Large Language Models the New Interface for Data Pipelines? | Explores the potential of large language models as a new interface for data pipelines. | large language model | |
| 8 | llmNER: (Zero\|Few)-Shot Named Entity Recognition, Exploiting the Power of Large Language Models | llmNER: zero-shot and few-shot named entity recognition powered by large language models. | large language model | |
| 9 | Buffer of Thoughts: Thought-Augmented Reasoning with Large Language Models | Proposes Buffer of Thoughts (BoT) to improve the accuracy, efficiency, and robustness of LLM reasoning. | large language model | ✅ |
| 10 | Benchmark Data Contamination of Large Language Models: A Survey | Surveys benchmark data contamination in large language models and discusses mitigation strategies and future directions. | large language model | |
| 11 | ValueBench: Towards Comprehensively Evaluating Value Orientations and Understanding of Large Language Models | Proposes ValueBench for comprehensively evaluating the value orientations of large language models. | large language model | ✅ |
| 12 | Uncovering Limitations of Large Language Models in Information Seeking from Tables | Proposes the TabIS benchmark, revealing the limitations of large language models in information seeking from tables. | large language model | |
| 13 | Spontaneous Speech-Based Suicide Risk Detection Using Whisper and Large Language Models | Uses Whisper and LLMs for speech-based suicide risk prediction in adolescents. | large language model | |
| 14 | Performance of large language models in numerical vs. semantic medical knowledge: Benchmarking on evidence-based Q&As | The EBMQA benchmark shows that large language models handle numerical medical knowledge worse than semantic knowledge, and fall short of human experts. | large language model | |
| 15 | Chaos with Keywords: Exposing Large Language Models Sycophantic Hallucination to Misleading Keywords and Evaluating Defense Strategies | Exposes large language models' sycophantic hallucinations under misleading keywords and evaluates defense strategies. | large language model | |
| 16 | A Survey on Medical Large Language Models: Technology, Application, Trustworthiness, and Future Directions | A survey of medical large language models: technology, application, trustworthiness, and future directions. | large language model | |
| 17 | Improving Audio Codec-based Zero-Shot Text-to-Speech Synthesis with Multi-Modal Context and Large Language Model | Proposes MMCE-Qformer, which uses multimodal context and an LLM to improve audio-codec-based zero-shot TTS, suited to long-text speech synthesis. | large language model | |
| 18 | MAIRA-2: Grounded Radiology Report Generation | MAIRA-2: a grounded radiology report generation model that improves report quality and verifiability. | large language model, multimodal | |
| 19 | A + B: A General Generator-Reader Framework for Optimizing LLMs to Unleash Synergy Potential | Proposes A+B, a general generator-reader framework for optimizing LLMs to unleash their synergy potential. | large language model, foundation model | |
| 20 | BLSP-Emo: Towards Empathetic Large Speech-Language Models | Proposes BLSP-Emo, an end-to-end speech-language pre-trained model with emotion understanding. | multimodal, instruction following | |
| 21 | Light-PEFT: Lightening Parameter-Efficient Fine-Tuning via Early Pruning | Light-PEFT: lightening the computational burden of parameter-efficient fine-tuning via early pruning. | large language model, foundation model | |
| 22 | LLMEmbed: Rethinking Lightweight LLM's Genuine Function in Text Classification | Proposes LLMEmbed, a lightweight LLM transfer-learning strategy for text classification that improves efficiency and performance. | large language model, chain-of-thought | ✅ |
| 23 | ArMeme: Propagandistic Content in Arabic Memes | Builds a dataset of propagandistic Arabic memes, providing a resource for detecting multimodal harmful content. | multimodal | |
| 24 | Time Sensitive Knowledge Editing through Efficient Finetuning | Proposes a time-sensitive knowledge-editing method based on efficient fine-tuning, improving LLMs' ability to update knowledge. | large language model | |
| 25 | Reinterpreting 'the Company a Word Keeps': Towards Explainable and Ontologically Grounded Language Models | Proposes an explainable and ontologically grounded language model aimed at overcoming limitations of current large language models. | large language model | |
| 26 | Transformers need glasses! Information over-squashing in language tasks | Reveals the information over-squashing problem in Transformer language models and proposes potential remedies. | large language model | |
| 27 | What Do Language Models Learn in Context? The Structured Task Hypothesis | Language models achieve in-context learning by composing tasks seen during pre-training. | large language model | |
| 28 | Do Language Models Understand Morality? Towards a Robust Detection of Moral Content | Proposes LLM-based zero-shot moral content detection with improved cross-domain robustness. | large language model | |
| 29 | Legal Judgment Reimagined: PredEx and the Rise of Intelligent AI Interpretation in Indian Courts | Proposes the PredEx dataset, improving LLM accuracy in legal judgment prediction and explanation for Indian courts. | large language model | |
| 30 | Automatic Bug Detection in LLM-Powered Text-Based Games Using LLMs | Proposes an automatic LLM-based method for detecting bugs in LLM-powered text-based games. | large language model | |
| 31 | MoralBench: Moral Evaluation of LLMs | MoralBench: a new benchmark for evaluating the moral reasoning of large language models. | large language model | ✅ |
| 32 | BEADs: Bias Evaluation Across Domains | Proposes the BEADs dataset for bias evaluation across domains, promoting the development of responsible AI systems. | large language model | ✅ |
| 33 | Every Answer Matters: Evaluating Commonsense with Probabilistic Measures | Proposes the commonsense frame completion (CFC) task and probabilistic evaluation, addressing multiple valid answers and bias in commonsense understanding. | large language model | |
| 34 | Exploring the Latest LLMs for Leaderboard Extraction | Explores large language models for extracting leaderboard information from AI research papers. | large language model | |
| 35 | Evaluating the Smooth Control of Attribute Intensity in Text Generation with LLMs | Proposes smooth control of attribute intensity in text generation to address generation consistency. | large language model | ✅ |
| 36 | What Languages are Easy to Language-Model? A Perspective from Learning Probabilistic Regular Languages | Studies the ability of RNN and Transformer language models to learn probabilistic regular languages, revealing key factors that affect learnability. | large language model | |
| 37 | Characterizing Similarities and Divergences in Conversational Tones in Humans and LLMs by Sampling with People | Proposes an iterative sampling method for characterizing similarities and divergences in conversational tones between humans and LLMs. | large language model | |
| 38 | DICE: Detecting In-distribution Contamination in LLM's Fine-tuning Phase for Math Reasoning | Proposes DICE to detect data contamination during the fine-tuning phase of large language models. | large language model | ✅ |
| 39 | Assessing LLMs for Zero-shot Abstractive Summarization Through the Lens of Relevance Paraphrasing | Proposes relevance paraphrasing to assess the robustness of large language models in zero-shot abstractive summarization. | large language model | |
| 40 | Effective Context Selection in LLM-based Leaderboard Generation: An Empirical Study | Proposes LLM-based context selection for efficiently generating AI research leaderboards, improving accuracy and reducing hallucination. | large language model | |
| 41 | NAP^2: A Benchmark for Naturalness and Privacy-Preserving Text Rewriting by Learning from Human | NAP^2: a benchmark for natural and privacy-preserving text rewriting learned from human behavior. | large language model | |