| 1 |
Personalized Prediction of Perceived Message Effectiveness Using Large Language Model Based Digital Twins |
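Uses large language model digital twins for personalized prediction of message effectiveness, improving the outcomes of mobile health interventions |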
large language model |
|
|
| 2 |
To Reason or Not to: Selective Chain-of-Thought in Medical Question Answering |
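Proposes Selective Chain-of-Thought (Selective CoT), which improves the efficiency of medical question answering and reduces computational cost. |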
large language model chain-of-thought |
|
|
| 3 |
Unlocking Multimodal Document Intelligence: From Current Triumphs to Future Frontiers of Visual Document Retrieval |
The first survey of multimodal document intelligence, focusing on visual document retrieval and multimodal large language models |
large language model multimodal |
|
|
| 4 |
Multilingual Large Language Models do not comprehend all natural languages to equal degrees |
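Reveals that multilingual large models comprehend different natural languages to different degrees, challenging the assumption that they perform best in English |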
large language model |
|
|
| 5 |
Assessing Risks of Large Language Models in Mental Health Support: A Framework for Automated Clinical AI Red Teaming |
Proposes a simulation-based clinical red-teaming framework to assess the risks of large language models in mental health support |
large language model |
|
|
| 6 |
Entropy in Large Language Models |
Compares large language models with natural language through entropy analysis |
large language model |
|
|
| 7 |
Sculpting the Vector Space: Towards Efficient Multi-Vector Visual Document Retrieval via Prune-then-Merge Framework |
Proposes the Prune-then-Merge framework to address the efficiency problem in multi-vector visual document retrieval |
multimodal |
|
|
| 8 |
NanoKnow: How to Know What Your Language Model Knows |
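NanoKnow: builds a benchmark dataset to investigate where LLMs' parametric knowledge comes from and how external knowledge complements it |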
large language model |
✅ |
|
| 9 |
Pyramid MoA: A Probabilistic Framework for Cost-Optimized Anytime Inference |
Proposes Pyramid MoA, which lowers large language model inference cost through dynamic routing and improves cost-effectiveness. |
large language model |
|
|
| 10 |
ReAttn: Improving Attention-based Re-ranking via Attention Re-weighting |
Proposes ReAttn, which improves attention-based re-ranking through attention re-weighting |
large language model |
|
|
| 11 |
Position: General Alignment Has Hit a Ceiling; Edge Alignment Must Be Taken Seriously |
Proposes edge alignment to address the limitations of general alignment in complex sociotechnical systems |
large language model |
|
|
| 12 |
gencat: Generative computerized adaptive testing |
Proposes GENCAT, an adaptive testing framework that uses generative large language models |
large language model |
|
|
| 13 |
SAMAS: A Spectrum-Guided Multi-Agent System for Achieving Style Fidelity in Literary Translation |
Proposes SAMAS, a spectrum-guided multi-agent system that improves style fidelity in literary translation. |
large language model |
|
|
| 14 |
KGHaluBench: A Knowledge Graph-Based Hallucination Benchmark for Evaluating the Breadth and Depth of LLM Knowledge |
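KGHaluBench: a knowledge-graph-based hallucination benchmark for large language models that evaluates the breadth and depth of their knowledge |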
large language model |
|
|
| 15 |
Anatomy of Unlearning: The Dual Impact of Fact Salience and Model Fine-Tuning |
Proposes the DUAL benchmark to address the question of knowledge provenance in machine unlearning |
large language model |
|
|