| 1 |
Early Stopping Chain-of-thoughts in Large Language Models |
Proposes ES-CoT, which reduces LLM inference cost by stopping chain-of-thought generation early |
large language model chain-of-thought |
|
|
| 2 |
Simulating a Bias Mitigation Scenario in Large Language Models |
Builds a simulation framework to evaluate strategies for mitigating bias in large language models |
large language model |
|
|
| 3 |
Annotating Training Data for Conditional Semantic Textual Similarity Measurement using Large Language Models |
Uses large language models to re-annotate training data for conditional semantic textual similarity |
large language model |
✅ |
|
| 4 |
Do Large Language Models Understand Word Senses? |
Evaluates how well large language models understand word senses and validates their effectiveness on word sense disambiguation |
large language model |
|
|
| 5 |
Large Language Models Discriminate Against Speakers of German Dialects |
Large language models exhibit discriminatory bias against speakers of German dialects |
large language model |
|
|
| 6 |
How Can Quantum Deep Learning Improve Large Language Models? |
Explores the potential of quantum deep learning to improve the adaptability of large language models |
large language model |
|
|
| 7 |
AssoCiAm: A Benchmark for Evaluating Association Thinking while Circumventing Ambiguity |
AssoCiAm: proposes a benchmark for evaluating associative thinking while circumventing ambiguity |
large language model multimodal |
|
|
| 8 |
Enhancing Time Awareness in Generative Recommendation |
Proposes the GRUT model, which improves generative recommendation through time awareness |
large language model TAMP |
✅ |
|
| 9 |
Estimating Semantic Alphabet Size for LLM Uncertainty Quantification |
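Proposes an improved semantic alphabet size estimator that increases the accuracy and interpretability of LLM uncertainty quantification |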
large language model |
|
|
| 10 |
Ticket-Bench: A Kickoff for Multilingual and Regionalized Agent Evaluation |
Ticket-Bench: a multilingual, regionalized agent evaluation benchmark aimed at improving the cultural adaptability of task-oriented agents |
large language model |
|
|
| 11 |
Correct-Detect: Balancing Performance and Ambiguity Through the Lens of Coreference Resolution in LLMs |
Correct-Detect: a framework revealing the trade-off between performance and ambiguity detection in LLM coreference resolution |
large language model |
|
|
| 12 |
Causal-Counterfactual RAG: The Integration of Causal-Counterfactual Reasoning into RAG |
Proposes Causal-Counterfactual RAG, which integrates causal reasoning into RAG to improve performance on knowledge-intensive tasks |
large language model |
|
|
| 13 |
Adding LLMs to the psycholinguistic norming toolbox: A practical guide to getting the most out of human ratings |
Proposes a method that uses large language models to augment psycholinguistic norming datasets |
large language model |
|
|
| 14 |
Apertus: Democratizing Open and Compliant LLMs for Global Language Environments |
Apertus: builds open, compliant large language models that support global language environments |
large language model |
|
|
| 15 |
ShinkaEvolve: Towards Open-Ended And Sample-Efficient Program Evolution |
ShinkaEvolve: proposes an efficient, open-source program evolution framework that addresses sample efficiency in scientific discovery |
large language model |
|
|
| 16 |
Enhancing Multi-Agent Debate System Performance via Confidence Expression |
Proposes the ConfMAD framework, which improves multi-agent debate system performance through confidence expression |
large language model |
|
|
| 17 |
Hala Technical Report: Building Arabic-Centric Instruction & Translation Models at Scale |
Proposes the Hala models to improve performance on Arabic instruction and translation tasks |
instruction following |
|
|
| 18 |
Do LLMs Align Human Values Regarding Social Biases? Judging and Explaining Social Biases with LLMs |
Evaluates how well large language models align with human values in social-bias scenarios and analyzes their explanatory ability |
large language model |
|
|
| 19 |
Characterizing Knowledge Graph Tasks in LLM Benchmarks Using Cognitive Complexity Frameworks |
Uses cognitive complexity frameworks to characterize knowledge graph tasks in LLM benchmarks |
large language model |
|
|
| 20 |
Exploring Data and Parameter Efficient Strategies for Arabic Dialect Identifications |
Explores data- and parameter-efficient strategies for Arabic dialect identification |
large language model |
|
|
| 21 |
Thinking in a Crowd: How Auxiliary Information Shapes LLM Reasoning |
Studies how auxiliary information affects LLM reasoning: harmful information significantly degrades model performance |
large language model |
✅ |
|
| 22 |
Implementing a Logical Inference System for Japanese Comparatives |
Proposes ccg-jcomp, a logical inference system for Japanese comparatives based on compositional semantics |
large language model |
|
|
| 23 |
DSPC: Dual-Stage Progressive Compression Framework for Efficient Long-Context Reasoning |
Proposes DSPC, a training-free dual-stage progressive compression framework that efficiently compresses long contexts |
large language model |
|
|