| 1 | Stepwise Perplexity-Guided Refinement for Efficient Chain-of-Thought Reasoning in Large Language Models | Proposes a perplexity-based stepwise refinement method that improves the efficiency of CoT reasoning in large language models. | large language model chain-of-thought | | |
| 2 | Beyond Words: Exploring Cultural Value Sensitivity in Multimodal Models | Evaluates cultural value sensitivity in multimodal models, revealing the complexity of their alignment with cultural values. | large language model multimodal | | |
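| 3 | Facilitating Long Context Understanding via Supervised Chain-of-Thought Reasoning | Proposes a long-context understanding method based on supervised chain-of-thought reasoning and builds LongFinanceQA, a synthetic dataset for the financial domain. | large language model chain-of-thought | | |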
| 4 | Robust Adaptation of Large Multimodal Models for Retrieval Augmented Hateful Meme Detection | Proposes a robust adaptation framework for large multimodal models, applied to retrieval-augmented hateful meme detection. | multimodal | ✅ | |
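| 5 | When People are Floods: Analyzing Dehumanizing Metaphors in Immigration Discourse with Large Language Models | Proposes a new method that combines word-level and document-level signals, using large language models to analyze metaphors in immigration discourse. | large language model | | |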
| 6 | UniGuardian: A Unified Defense for Detecting Prompt Injection, Backdoor Attacks and Adversarial Attacks in Large Language Models | UniGuardian: a unified defense mechanism for detecting prompt injection, backdoor attacks, and adversarial attacks in large language models. | large language model | | |
| 7 | STEER-ME: Assessing the Microeconomic Reasoning of Large Language Models | STEER-ME: assesses the microeconomic reasoning ability of large language models. | large language model | | |
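| 8 | Towards Text-Image Interleaved Retrieval | Proposes the text-image interleaved retrieval task and the MME model, addressing information retrieval in settings with multiple images and text. | large language model multimodal | | |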
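| 9 | Language Models Can Predict Their Own Behavior | Uses a language model's internal representations to predict its behavior without generating tokens, reducing risk and speeding up inference. | instruction following chain-of-thought | | |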
| 10 | Grounding LLM Reasoning with Knowledge Graphs | Proposes a knowledge-graph-based LLM reasoning framework that improves reasoning accuracy and interpretability. | large language model chain-of-thought | | |
| 11 | Natural Language Generation from Visual Events: State-of-the-Art and Key Open Questions | Surveys natural language generation from visual events: analyzes existing methods and discusses key open questions. | multimodal | | |
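| 12 | LLMPopcorn: An Empirical Study of LLMs as Assistants for Popular Micro-video Generation | LLMPopcorn: explores the potential of large language models, and methods for using them, to assist in generating high-traffic micro-videos. | large language model | | |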
| 13 | Language Models are Few-Shot Graders | Proposes an LLM-based automatic short answer grading (ASAG) pipeline that improves grading accuracy and efficiency. | large language model | | |
| 14 | Training Turn-by-Turn Verifiers for Dialogue Tutoring Agents: The Curious Case of LLMs as Your Coding Tutors | Proposes the Trace-and-Verify framework for training LLM-based conversational coding tutor agents. | large language model | | |
| 15 | Evaluating and Enhancing Out-of-Domain Generalization of Task-Oriented Dialog Systems for Task Completion without Turn-level Dialog Annotations | Proposes the ZeroToD framework, which improves task completion by zero-shot task-oriented dialog systems in unseen domains. | large language model | | |
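| 16 | Improving Multi-turn Task Completion in Task-Oriented Dialog Systems via Prompt Chaining and Fine-Grained Feedback | The RealTOD framework uses prompt chaining and fine-grained feedback to significantly improve the reliability of multi-turn task completion in task-oriented dialog systems. | large language model | | |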
| 17 | Multilingual Language Model Pretraining using Machine-translated Data | Pretrains multilingual models on machine-translated data, significantly improving performance in non-English languages. | large language model | | |
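| 18 | Neural Attention Search | Proposes the Neural Attention Search (NAtS) framework, which reduces the KV cache size of Transformer models at inference time and thereby lowers inference cost. | large language model | | |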
| 19 | RuozhiBench: Evaluating LLMs with Logical Fallacies and Misleading Premises | RuozhiBench: builds a benchmark of logical fallacies and misleading premises to evaluate the reasoning ability of LLMs. | large language model | | |
| 20 | Adapting Psycholinguistic Research for LLMs: Gender-inclusive Language in a Coreference Context | Studies how LLMs interpret gender-inclusive language, revealing gender bias in coreference. | large language model | | |