| 1 |
Cross-Examiner: Evaluating Consistency of Large Language Model-Generated Explanations |
Proposes Cross-Examiner, a method for evaluating the consistency of explanations generated by large language models |
large language model |
|
|
| 2 |
Exploring the Word Sense Disambiguation Capabilities of Large Language Models |
Explores the capabilities of large language models on word sense disambiguation |
large language model |
|
|
| 3 |
Exploiting Instruction-Following Retrievers for Malicious Information Retrieval |
Reveals the safety risks of instruction-following retrievers in malicious information retrieval |
instruction following |
|
|
| 4 |
Position-Aware Depth Decay Decoding ($D^3$): Boosting Large Language Model Inference Efficiency |
Proposes position-aware depth decay decoding to improve large language model inference efficiency |
large language model |
|
|
| 5 |
Enhancing Multi-Hop Fact Verification with Structured Knowledge-Augmented Large Language Models |
Proposes LLM-SKAN, which uses structured knowledge to improve LLM performance on multi-hop fact verification |
large language model |
|
|
| 6 |
Large Language Models for Outpatient Referral: Problem Definition, Benchmarking and Challenges |
Proposes an LLM-based evaluation framework and benchmark for intelligent outpatient referral |
large language model |
|
|
| 7 |
CLEV: LLM-Based Evaluation Through Lightweight Efficient Voting for Free-Form Question-Answering |
CLEV: an LLM-based evaluation framework using efficient voting for free-form question answering |
large language model, instruction following |
|
|
| 8 |
LLMs Know What to Drop: Self-Attention Guided KV Cache Eviction for Efficient Long-Context Inference |
Proposes SAGE-KV, which uses self-attention to guide KV cache eviction and improve long-context LLM inference efficiency. |
large language model |
|
|
| 9 |
Interpretable and Robust Dialogue State Tracking via Natural Language Summarization with LLMs |
Proposes LLM-based natural language dialogue state tracking (NL-DST), improving robustness and interpretability in open-domain dialogue. |
large language model |
|
|
| 10 |
NSF-SciFy: Mining the NSF Awards Database for Scientific Claims |
NSF-SciFy: builds a large-scale dataset of scientific claims for scientific discovery and evaluation |
large language model |
|
|
| 11 |
DeepReview: Improving LLM-based Paper Review with Human-like Deep Thinking Process |
DeepReview: improves LLM-based paper review by simulating a human-like deep thinking process |
large language model |
|
|
| 12 |
Transferring Extreme Subword Style Using Ngram Model-Based Logit Scaling |
Proposes an extreme subword style transfer method based on n-gram model logit scaling, improving style control in large language models. |
large language model |
|
|
| 13 |
ESPnet-SDS: Unified Toolkit and Demo for Spoken Dialogue Systems |
ESPnet-SDS: a unified toolkit and demo platform for spoken dialogue systems |
foundation model |
✅ |
|
| 14 |
ReviewAgents: Bridging the Gap Between Human and AI-Generated Paper Reviews |
Proposes the ReviewAgents framework, which uses LLMs to generate high-quality academic paper reviews and narrow the gap with human reviews. |
large language model |
|
|
| 15 |
Fact-checking with Generative AI: A Systematic Cross-Topic Examination of LLMs Capacity to Detect Veracity of Political Information |
Systematically evaluates the capabilities and limitations of large language models in fact-checking political information |
large language model |
|
|
| 16 |
OpenRAG: Optimizing RAG End-to-End via In-Context Retrieval Learning |
OpenRAG: optimizes RAG end-to-end via in-context retrieval learning, improving retrieval consistency. |
large language model |
|
|
| 17 |
Automating Violence Detection and Categorization from Ancient Texts |
Uses large language models to automatically detect and categorize violence in ancient texts |
large language model |
|
|
| 18 |
RigoChat 2: an adapted language model to Spanish using a bounded dataset and reduced hardware |
RigoChat 2: adapts and optimizes a language model for Spanish using a limited dataset and modest hardware resources. |
large language model |
|
|
| 19 |
OASIS: Order-Augmented Strategy for Improved Code Search |
Proposes OASIS, a code search method based on an order-augmented strategy that improves code embedding quality. |
large language model |
|
|
| 20 |
Odysseus Navigates the Sirens' Song: Dynamic Focus Decoding for Factual and Diverse Open-Ended Text Generation |
Proposes Dynamic Focus Decoding (DFD), which improves the factuality and diversity of open-ended text generation without additional data. |
large language model |
|
|
| 21 |
Learning to Search Effective Example Sequences for In-Context Learning |
Proposes a Beam Search-based Example Sequence Constructor (BESC) for optimizing example selection in in-context learning. |
large language model |
|
|