| 1 |
Typhoon 2: A Family of Open Text and Multimodal Thai Large Language Models |
Typhoon 2: a family of open-source text and multimodal large language models for Thai |
large language model multimodal |
|
|
| 2 |
Fake News Detection: Comparative Evaluation of BERT-like Models and Large Language Models with Generative AI-Annotated Data |
Compares BERT-like models and large language models on fake news detection using generative AI-annotated data. |
large language model |
|
|
| 3 |
Prompting Strategies for Enabling Large Language Models to Infer Causation from Correlation |
Proposes the PC-SubQ prompting strategy to improve large language models' ability to infer causation from correlation |
large language model |
|
|
| 4 |
Memorization Over Reasoning? Exposing and Mitigating Verbatim Memorization in Large Language Models' Character Understanding Evaluation |
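Proposes a method to mitigate rote memorization by LLMs in character understanding evaluation, exposing and reducing the effect of verbatim memorization. |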
large language model |
|
|
| 5 |
Hansel: Output Length Controlling Framework for Large Language Models |
Hansel: a framework for controlling the output length of large language models |
large language model |
|
|
| 6 |
MATCHED: Multimodal Authorship-Attribution To Combat Human Trafficking in Escort-Advertisement Data |
Proposes MATCHED, a multimodal authorship-attribution method to combat human trafficking in online escort advertisements |
multimodal |
|
|
| 7 |
Physics Reasoner: Knowledge-Augmented Reasoning for Solving Physics Problems with Large Language Models |
Physics Reasoner: proposes a knowledge-augmented framework that uses large language models to solve physics problems |
large language model |
|
|
| 8 |
Federated Learning and RAG Integration: A Scalable Approach for Medical Large Language Models |
Proposes integrating federated learning with RAG for medical LLMs, improving performance while protecting privacy |
large language model |
|
|
| 9 |
Large Language Models for Automated Literature Review: An Evaluation of Reference Generation, Abstract Writing, and Review Composition |
Evaluates large language models on automating literature reviews: reference generation, abstract writing, and review composition |
large language model |
|
|
| 10 |
A Statistical and Multi-Perspective Revisiting of the Membership Inference Attack in Large Language Models |
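Proposes a statistical, multi-perspective analysis of membership inference attacks on large language models, revealing inconsistencies in their performance. |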
large language model |
|
|
| 11 |
Cracking the Code of Hallucination in LVLMs with Vision-aware Head Divergence |
Proposes a vision-aware head divergence metric and a head reinforcement method to mitigate hallucination in LVLMs |
large language model multimodal |
|
|
| 12 |
Multi-OphthaLingua: A Multilingual Benchmark for Assessing and Debiasing LLM Ophthalmological QA in LMICs |
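Proposes Multi-OphthaLingua, a multilingual ophthalmology QA benchmark, and designs the CLARA method to mitigate bias in LLMs applied in low- and middle-income countries. |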
large language model chain-of-thought |
|
|
| 13 |
Pipeline Analysis for Developing Instruct LLMs in Low-Resource Languages: A Case Study on Basque |
Proposes a pipeline analysis and optimization approach for developing instruct LLMs in Basque, a low-resource language |
large language model instruction following |
|
|
| 14 |
GAMEBoT: Transparent Assessment of LLM Reasoning in Games |
Proposes GAMEBoT to address the lack of transparency in LLM reasoning evaluation |
large language model chain-of-thought |
✅ |
|
| 15 |
Maximize Your Data's Potential: Enhancing LLM Accuracy with Two-Phase Pretraining |
Proposes a two-phase pretraining method that optimizes data selection and blending strategies to improve LLM accuracy. |
large language model |
|
|
| 16 |
Digestion Algorithm in Hierarchical Symbolic Forests: A Fast Text Normalization Algorithm and Semantic Parsing Framework for Specific Scenarios and Lightweight Deployment |
Proposes the DAHSF algorithm, which combines text normalization and semantic parsing for specific scenarios and lightweight deployment. |
large language model |
|
|
| 17 |
Channel Merging: Preserving Specialization for Merged Experts |
Proposes channel merging, which preserves specialized knowledge when merging expert models and improves storage efficiency. |
large language model |
|
|
| 18 |
Crabs: Consuming Resource via Auto-generation for LLM-DoS Attack under Black-box Settings |
Proposes AutoDoS, a black-box LLM-DoS attack that achieves denial of service by automatically generating resource-consuming prompts. |
large language model |
✅ |
|
| 19 |
ECG-Byte: A Tokenizer for End-to-End Generative Electrocardiogram Language Modeling |
Proposes ECG-Byte, a tokenizer for end-to-end electrocardiogram language modeling that improves training efficiency and interpretability. |
large language model |
|
|
| 20 |
TheAgentCompany: Benchmarking LLM Agents on Consequential Real World Tasks |
TheAgentCompany: builds a benchmark to evaluate LLM agents on real-world tasks |
large language model |
|
|
| 21 |
Towards an optimised evaluation of teachers' discourse: The case of engaging messages |
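Proposes an optimized method for evaluating teachers' discourse, using large language models to identify engaging messages in the classroom. |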
large language model |
|
|
| 22 |
FarExStance: Explainable Stance Detection for Farsi |
FarExStance: proposes an explainable stance detection dataset and baseline models for Farsi |
large language model |
|
|
| 23 |
A Rose by Any Other Name: LLM-Generated Explanations Are Good Proxies for Human Explanations to Collect Label Distributions on NLI |
Uses LLM-generated explanations as substitutes for human explanations to collect label distributions in natural language inference |
large language model |
|
|
| 24 |
Domain-adaptative Continual Learning for Low-resource Tasks: Evaluation on Nepali |
Proposes a domain-adaptive continual learning method for low-resource tasks such as Nepali |
large language model |
|
|
| 25 |
Meta-Reflection: A Feedback-Free Reflection Learning Framework |
Proposes Meta-Reflection, a feedback-free reflection learning framework that improves LLM performance on tasks such as e-commerce intent recognition. |
large language model |
|
|
| 26 |
Evaluation of LLM Vulnerabilities to Being Misused for Personalized Disinformation Generation |
Evaluates the vulnerability of large language models to generating personalized disinformation, revealing failures of safety filters. |
large language model |
|
|
| 27 |
PsyDT: Using LLMs to Construct the Digital Twin of Psychological Counselor with Personalized Counseling Style for Psychological Counseling |
PsyDT: uses LLMs to construct a digital twin of a psychological counselor with a personalized counseling style |
large language model |
|
|
| 28 |
LIFT: Improving Long Context Understanding Through Long Input Fine-Tuning |
Proposes LIFT: improving long-context understanding through long input fine-tuning |
large language model |
|
|
| 29 |
EvoWiki: Evaluating LLMs on Evolving Knowledge |
EvoWiki: an automatically updated dataset for evaluating LLMs on evolving knowledge |
large language model |
|
|
| 30 |
Socio-Culturally Aware Evaluation Framework for LLM-Based Content Moderation |
Proposes a socio-culturally aware evaluation framework for assessing LLM-based content moderation. |
large language model |
|
|
| 31 |
EscapeBench: Towards Advancing Creative Intelligence of Language Model Agents |
EscapeBench: an escape-room benchmark for advancing the creative intelligence of language model agents |
chain-of-thought |
|
|
| 32 |
MetaRuleGPT: Recursive Numerical Reasoning of Language Models Trained with Simple Rules |
MetaRuleGPT: improves the recursive numerical reasoning of language models by learning simple rules |
large language model |
|
|
| 33 |
Lightweight Safety Classification Using Pruned Language Models |
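Proposes Layer Enhanced Classification (LEC), which uses pruned language models for lightweight safety content classification and prompt injection detection. |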
large language model |
|
|