| 1 |
Anticipating Innovation Using Large Language Models |
Proposes the TechToken model, which uses patent language to predict future combinatorial technological innovation. |
large language model |
|
|
| 2 |
Gradients with Respect to Semantics Preserving Embeddings Tell the Uncertainty of Large Language Models |
Proposes SemGrad, a sampling-free uncertainty quantification method for LLMs based on semantic gradients, improving efficiency and accuracy. |
large language model |
|
|
| 3 |
Detecting Hallucinations in Large Language Models via Internal Attention Divergence Signals |
Proposes a hallucination detection method for large language models based on internal attention divergence signals. |
large language model |
|
|
| 4 |
Adapting Large Language Models to a Low-Resource Agglutinative Language: A Comparative Study of LoRA and QLoRA for Bashkir |
Compares LoRA and QLoRA fine-tuning of large language models for Bashkir, a low-resource agglutinative language. |
large language model |
|
|
| 5 |
Misaligned by Reward: Socially Undesirable Preferences in LLMs |
Reveals social bias in reward models: an evaluation framework for socially undesirable preferences in LLMs. |
large language model, instruction following |
|
|
| 6 |
Assessing Cognitive Effort in L2 Idiomatic Processing: An Eye-Tracking Dataset |
Builds an eye-tracking dataset to assess the cognitive load of idiom comprehension in second-language learners. |
large language model |
|
|
| 7 |
PSK at SemEval-2026 Task 9: Multilingual Polarization Detection Using Ensemble Gemma Models with Synthetic Data Augmentation |
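Proposes a multilingual polarization detection method based on ensemble Gemma models and synthetic data augmentation, achieving strong results in the SemEval-2026 task. |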
large language model |
|
|
| 8 |
UniVer: A Unified Perspective for Multi-step and Multi-draft Speculative Decoding |
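Proposes UniVer, which unifies multi-step and multi-draft speculative decoding through conditional optimal transport, improving LLM inference efficiency. |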
large language model |
|
|
| 9 |
StoryAlign: Evaluating and Training Reward Models for Story Generation |
StoryAlign proposes the StoryReward model and the StoryRMB benchmark to improve alignment with human preferences in story generation. |
large language model |
✅ |
|
| 10 |
GEM: Graph-Enhanced Mixture-of-Experts with ReAct Agents for Dialogue State Tracking |
GEM: a graph-enhanced mixture-of-experts model combined with ReAct agents to improve dialogue state tracking performance. |
large language model |
|
|
| 11 |
Beyond Semantics: An Evidential Reasoning-Aware Multi-View Learning Framework for Trustworthy Mental Health Prediction |
Proposes an evidential-reasoning-based multi-view learning framework for trustworthy mental health prediction. |
large language model |
|
|
| 12 |
The Pinocchio Dimension: Phenomenality of Experience as the Primary Axis of LLM Psychometric Differences |
Uses psychometrics to reveal the "Pinocchio dimension" of LLMs, distinguishing richness of experience from behavioral responses. |
large language model |
|
|
| 13 |
Why Expert Alignment Is Hard: Evidence from Subjective Evaluation |
Studies why expert alignment is hard: evidence from subjective evaluation reveals heterogeneity and instability in expert judgments. |
large language model |
|
|
| 14 |
BenCSSmark: Making the Social Sciences Count in LLM Research |
BenCSSmark: improving LLM evaluation and generalization through social science tasks. |
large language model |
|
|
| 15 |
Elicitation Matters: How Prompts and Query Protocols Shape LLM Surrogates under Sparse Observations |
Studies how prompts and query protocols affect the performance of LLM surrogate models under sparse observations. |
large language model |
|
|
| 16 |
Paraphrase-Induced Output-Mode Collapse: When LLMs Break Character Under Semantically Equivalent Inputs |
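Reveals output-mode collapse in large language models under semantically equivalent inputs, and proposes the PARACONSIST benchmark for evaluation. |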
large language model |
|
|
| 17 |
Graph-Augmented LLMs for Swiss MP Ideology Prediction |
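Proposes the PG-RAG framework, which uses graph-augmented LLMs to predict the ideological positions of Swiss members of parliament. |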
large language model |
|
|