| 1 |
Automated ICD Classification of Psychiatric Diagnoses: From Classical NLP to Large Language Models |
Automates ICD classification of psychiatric disorders using large language models |
large language model |
|
|
| 2 |
Tracing the ongoing emergence of human-like reasoning in Large Language Models |
Evaluates the emergence of human-like reasoning in large language models on conditional reasoning tasks |
large language model |
|
|
| 3 |
Towards Context-Invariant Safety Alignment for Large Language Models |
Proposes anchored invariance regularization (AIR) to improve the safety of large language models in adversarial contexts. |
large language model |
|
|
| 4 |
LASH: Adaptive Semantic Hybridization for Black-Box Jailbreaking of Large Language Models |
Proposes LASH to address black-box jailbreaking of large language models |
large language model |
|
|
| 5 |
Do No Harm? Hallucination and Actor-Level Abuse in Web-Deployed Medical Large Language Models |
Evaluates hallucination and abuse risks in medical large language models, revealing safety hazards in web-deployed models |
large language model |
|
|
| 6 |
Manga109-v2026: Revisiting Manga109 Annotations for Modern Manga Understanding |
Manga109-v2026: revises the Manga109 dataset to improve modern manga understanding |
multimodal |
|
|
| 7 |
TextReg: Mitigating Prompt Distributional Overfitting via Regularized Text-Space Optimization |
TextReg: mitigates prompt distributional overfitting via regularized text-space optimization |
large language model |
|
|
| 8 |
GraphRAG on Consumer Hardware: Benchmarking Local LLMs for Healthcare EHR Schema Retrieval |
Implements GraphRAG on consumer hardware and evaluates local LLMs on healthcare EHR schema retrieval. |
large language model |
|
|
| 9 |
The Illusion of Intervention: Your LLM-Simulated Experiment is an Observational Study |
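Reveals the illusion of intervention in LLM-simulated experiments: they are in fact observational studies, with a focus on the bias introduced by user drift. |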
large language model |
|
|
| 10 |
DIVE: Embedding Compression via Self-Limiting Gradient Updates |
DIVE: compresses embeddings via self-limiting gradient updates, improving few-shot retrieval performance. |
large language model |
|
|
| 11 |
Beyond Text-to-SQL: An Agentic LLM System for Governed Enterprise Analytics APIs |
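Proposes Analytic Agent, an LLM-driven agent system for enterprise analytics APIs that addresses the limitations of traditional Text-to-SQL in enterprise applications. |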
large language model |
|
|
| 12 |
Strategy-Induct: Task-Level Strategy Induction for Instruction Generation |
Strategy-Induct: an answer-free method for task-level strategy induction in instruction generation |
large language model |
|
|
| 13 |
Terminal-World: Scaling Terminal-Agent Environments via Agent Skills |
Terminal-World: scales terminal-agent environments via agent skills to improve task execution |
large language model |
|
|
| 14 |
Distributional Alignment as a Criterion for Designing Task Vectors in In-Context Learning |
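Proposes the linear task vector (LTV) method, which improves task-vector performance in in-context learning through distributional alignment. |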
large language model |
|
|
| 15 |
Leveraging LLMs for Grammar Adaptation: A Study on Metamodel-Grammar Co-Evolution |
Proposes an LLM-based grammar adaptation method to address the difficulty of maintaining grammars after metamodel evolution. |
large language model |
|
|
| 16 |
Quantifying the cross-linguistic effects of syncretism on agreement attraction |
Uses large language models to quantify the cross-linguistic effects of morphological syncretism on agreement attraction |
large language model |
|
|
| 17 |
"I didn't Make the Micro Decisions": Measuring, Inducing, and Exposing Goal-Level AI Contributions in Collaboration |
Proposes the CoTrace framework for measuring AI's contribution to goal shaping in human-AI collaboration |
large language model |
|
|
| 18 |
Do LLMs Know What Luxembourgish Borrows? Probing Lexical Neology in Low-Resource Multilingual Models |
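Proposes the LexNeo-Bench benchmark to study how LLMs handle lexical borrowing and innovation in low-resource languages, along with a knowledge-graph-augmented prompting method. |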
large language model |
|
|
| 19 |
WCXB: A Multi-Type Web Content Extraction Benchmark |
Proposes WCXB, a multi-type web content extraction benchmark, revealing blind spots of existing methods on structured pages. |
large language model |
|
|
| 20 |
LoCar: Localization-Aware Evaluation of In-Vehicle Assistants through Fine-Grained Sociolinguistic Control |
LoCar: localization-aware evaluation of in-vehicle assistants through fine-grained sociolinguistic control |
large language model |
|
|
| 21 |
GradeLegal: Automated Grading for German Legal Cases |
GradeLegal: uses large language models to automatically grade answers to German legal cases |
large language model |
|
|
| 22 |
Cross-lingual robustness of LLM-brain alignment and its computational roots |
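Shows that the alignment between LLMs and brain activity is spatially robust across languages, although its computational roots remain unclear. |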
large language model |
|
|
| 23 |
JobArabi: An Arabic Corpus and Analysis of Job Announcements from Social Media |
JobArabi: builds and analyzes an Arabic corpus of job announcements from social media |
TAMP |
|
|
| 24 |
Assessing socio-economic climate impacts from text data |
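Proposes a text-data analysis framework that improves the accuracy and comparability of assessments of the socio-economic impacts of climate hazards |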
large language model |
|
|
| 25 |
HRM-Text: Efficient Pretraining Beyond Scaling |
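Proposes HRM-Text, which substantially reduces the pretraining cost of large language models through a hierarchical recurrent model and task-driven pretraining. |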
large language model |
|
|