| 1 |
Do Machines Think Emotionally? Cognitive Appraisal Analysis of Large Language Models |
Proposes the CoRE benchmark, which analyzes the emotional reasoning ability of large language models through cognitive appraisal |
large language model foundation model |
|
|
| 2 |
ASCoT: An Adaptive Self-Correction Chain-of-Thought Method for Late-Stage Fragility in LLMs |
Proposes ASCoT, a method that addresses late-stage fragility in the reasoning chains of large language models |
large language model chain-of-thought |
|
|
| 3 |
Pruning Large Language Models by Identifying and Preserving Functional Networks |
Proposes a pruning method for large language models based on identifying and preserving functional networks |
large language model |
✅ |
|
| 4 |
Attention Basin: Why Contextual Position Matters in Large Language Models |
Reveals the attention basin phenomenon in LLMs and proposes the AttnRank re-ranking framework to improve in-context learning |
large language model |
|
|
| 5 |
A Multi-Stage Large Language Model Framework for Extracting Suicide-Related Social Determinants of Health |
Proposes a multi-stage large language model framework for extracting suicide-related social determinants of health |
large language model |
|
|
| 6 |
A Rose by Any Other Name Would Smell as Sweet: Categorical Homotopy Theory for Large Language Models |
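Proposes a large language model framework based on categorical homotopy theory to address inconsistent probability distributions across semantically equivalent sentences |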
large language model |
|
|
| 7 |
LLMEval-Fair: A Large-Scale Longitudinal Study on Robust and Fair Evaluation of Large Language Models |
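LLMEval-Fair: proposes a large-scale dynamic evaluation framework that addresses data contamination and overfitting in large language model evaluation. |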
large language model |
|
|
| 8 |
MultiCheck: Strengthening Web Trust with Unified Multimodal Fact Verification |
MultiCheck: strengthens web trust through unified multimodal fact verification |
multimodal |
|
|
| 9 |
Understanding and Mitigating Errors of LLM-Generated RTL Code |
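Proposes a framework that combines domain knowledge with a debugging loop, significantly improving the accuracy of LLM-generated RTL code. |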
large language model multimodal |
|
|
| 10 |
FineDialFact: A benchmark for Fine-grained Dialogue Fact Verification |
Proposes FineDialFact, a benchmark dataset for fine-grained dialogue fact verification. |
large language model chain-of-thought |
|
|
| 11 |
Efficient Reasoning for Large Reasoning Language Models via Certainty-Guided Reflection Suppression |
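Proposes CGRS, a method that uses certainty-guided suppression of reflection to improve the reasoning efficiency of large language models. |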
chain-of-thought |
|
|
| 12 |
RTTC: Reward-Guided Collaborative Test-Time Compute |
Proposes RTTC, a reward-guided adaptive test-time compute framework that improves LLM performance at inference time. |
large language model |
|
|
| 13 |
The World According to LLMs: How Geographic Origin Influences LLMs' Entity Deduction Capabilities |
Geo20Q+: reveals that LLMs' entity deduction is biased by the geographic origin of the entity |
large language model |
|
|
| 14 |
LAG: Logic-Augmented Generation from a Cartesian Perspective |
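Proposes LAG, a logic-augmented generation method grounded in Cartesian philosophy that improves accuracy and reduces hallucination on knowledge-intensive tasks. |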
large language model |
|
|
| 15 |
MyCulture: Exploring Malaysia's Diverse Culture under Low-Resource Language Constraints |
MyCulture: proposes a Malaysian culture benchmark for evaluating LLMs' cultural understanding under low-resource language constraints. |
large language model |
|
|
| 16 |
Align, Don't Divide: Revisiting the LoRA Architecture in Multi-Task Learning |
Align-LoRA: improves LoRA performance in multi-task learning by aligning task representations |
large language model |
✅ |
|
| 17 |
Evaluation of Finetuned LLMs in AMR Parsing |
Fine-tunes LLMs to reach performance on AMR parsing comparable to complex SOTA models. |
large language model |
|
|
| 18 |
NanoCodec: Towards High-Quality Ultra Fast Speech LLM Inference |
NanoCodec: a low-frame-rate audio codec for high-quality, ultra-fast speech LLM inference |
large language model |
|
|
| 19 |
TASE: Token Awareness and Structured Evaluation for Multilingual Language Models |
TASE: a token-awareness and structured-evaluation benchmark for multilingual language models |
large language model |
✅ |
|
| 20 |
Decision-Making with Deliberation: Meta-reviewing as a Document-grounded Dialogue |
Proposes a dialogue agent to improve the efficiency of meta-reviewing |
large language model |
✅ |
|
| 21 |
Navigating Through Paper Flood: Advancing LLM-based Paper Evaluation through Domain-Aware Retrieval and Latent Reasoning |
PaperEval: improves LLM performance in paper evaluation through domain-aware retrieval and latent reasoning |
large language model |
|
|
| 22 |
BEE-RAG: Balanced Entropy Engineering for Retrieval-Augmented Generation |
BEE-RAG: improves the performance of retrieval-augmented generation models through balanced entropy engineering |
large language model |
|
|