| 1 |
VLM-KG: Multimodal Radiology Knowledge Graph Generation |
Proposes VLM-KG, the first framework for generating multimodal radiology knowledge graphs. |
multimodal instruction following |
|
|
| 2 |
Large Language Models Meet Stance Detection: A Survey of Tasks, Methods, Applications, Challenges and Future Directions |
Survey of large language models in stance detection: applications, methods, challenges, and future directions. |
large language model multimodal |
|
|
| 3 |
Accelerating Chain-of-Thought Reasoning: When Goal-Gradient Importance Meets Dynamic Skipping |
Proposes the Adaptive GoGI-Skip framework, which speeds up CoT reasoning efficiently through dynamic skipping. |
large language model chain-of-thought |
|
|
| 4 |
ALOHA: Empowering Multilingual Agent for University Orientation with Hierarchical Retrieval |
ALOHA: a multilingual agent built on hierarchical retrieval to enhance university orientation services. |
large language model Aloha |
|
|
| 5 |
HealthBench: Evaluating Large Language Models Towards Improved Human Health |
HealthBench: a benchmark for evaluating the performance and safety of large language models in healthcare. |
large language model instruction following |
|
|
| 6 |
Enhancing Thyroid Cytology Diagnosis with RAG-Optimized LLMs and Pathology Foundation Models |
Combines RAG-optimized LLMs with pathology foundation models to improve thyroid cytology diagnosis. |
large language model foundation model |
|
|
| 7 |
Aya Vision: Advancing the Frontier of Multilingual Multimodality |
Aya Vision: advances the multilingual multimodal frontier through data synthesis and model merging. |
multimodal |
|
|
| 8 |
LCES: Zero-shot Automated Essay Scoring via Pairwise Comparisons Using Large Language Models |
Proposes LCES, an LLM-based comparative essay scoring method that achieves zero-shot automated essay scoring. |
large language model |
|
|
| 9 |
Large Language Model Psychometrics: A Systematic Review of Evaluation, Validation, and Enhancement |
A review of LLM psychometrics: using psychometrics to evaluate, validate, and enhance large language models. |
large language model |
✅ |
|
| 10 |
HCR-Reasoner: Synergizing Large Language Models and Theory for Human-like Causal Reasoning |
HCR-Reasoner: integrates large language models with causal theory to achieve human-like causal reasoning. |
large language model |
|
|
| 11 |
Probability Consistency in Large Language Models: Theoretical Foundations Meet Empirical Discrepancies |
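Reveals probability-consistency deviations in large language models: an analysis of theoretical completeness versus empirical discrepancies. |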
large language model |
|
|
| 12 |
NurValues: Real-World Nursing Values Evaluation for Large Language Models in Clinical Context |
NurValues: builds a benchmark for evaluating how well large language models align with nursing values in clinical contexts. |
large language model |
✅ |
|
| 13 |
Small but Significant: On the Promise of Small Language Models for Accessible AIED |
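Explores the potential of small language models for accessible AI in education, addressing the availability of AI tools for resource-constrained institutions. |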
large language model |
|
|
| 14 |
Enhancing Cache-Augmented Generation (CAG) with Adaptive Contextual Compression for Scalable Knowledge Integration |
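Proposes a cache-augmented generation framework with adaptive contextual compression to improve the efficiency of large-scale knowledge integration. |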
large language model |
|
|
| 15 |
A Head to Predict and a Head to Question: Pre-trained Uncertainty Quantification Heads for Hallucination Detection in LLM Outputs |
Proposes pre-trained uncertainty quantification heads for detecting hallucinations in LLM outputs. |
large language model |
|
|
| 16 |
A suite of LMs comprehend puzzle statements as well as humans |
Large language models match or even exceed human performance in comprehending puzzle statements. |
large language model |
|
|
| 17 |
Adaptive Schema-aware Event Extraction with Retrieval-Augmented Generation |
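Proposes the ASEE framework, which combines schema paraphrasing with retrieval-augmented generation to address schema selection and hallucination in event extraction. |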
large language model |
|
|
| 18 |
Automatic Task Detection and Heterogeneous LLM Speculative Decoding |
Proposes a heterogeneous LLM speculative decoding method that improves downstream task efficiency and accelerates LLM inference. |
large language model |
|
|
| 19 |
LibVulnWatch: A Deep Assessment Agent System and Leaderboard for Uncovering Hidden Vulnerabilities in Open-Source AI Libraries |
LibVulnWatch: uses agents to deeply assess potential security risks in open-source AI libraries. |
large language model |
|
|
| 20 |
IterKey: Iterative Keyword Generation with LLMs for Enhanced Retrieval Augmented Generation |
IterKey: uses LLMs to generate keywords iteratively, improving retrieval-augmented generation. |
large language model |
|
|
| 21 |
A document processing pipeline for the construction of a dataset for topic modeling based on the judgments of the Italian Supreme Court |
A document processing pipeline for building a topic modeling dataset from the judgments of the Italian Supreme Court. |
large language model |
|
|
| 22 |
TUMS: Enhancing Tool-use Abilities of LLMs with Multi-structure Handlers |
TUMS: enhances the tool-use abilities of LLMs with multi-structure handlers. |
large language model |
|
|
| 23 |
Towards Contamination Resistant Benchmarks |
Proposes a contamination-resistant LLM evaluation benchmark to address the reliability problems of existing evaluations. |
large language model |
|
|
| 24 |
Evaluating the Effectiveness of Black-Box Prompt Optimization as the Scale of LLMs Continues to Grow |
Evaluates the effectiveness of black-box prompt optimization on large-scale LLMs and finds diminishing returns. |
large language model |
|
|