| 1 |
Measuring Stability Beyond Accuracy in Small Open-Source Medical Large Language Models for Pediatric Endocrinology |
Evaluates the stability of small open-source medical LLMs in pediatric endocrinology, going beyond conventional accuracy metrics |
large language model |
|
|
| 2 |
Cross-Platform Evaluation of Large Language Model Safety in Pediatric Consultations: Evolution of Adversarial Robustness and the Scale Paradox |
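Evaluates the safety of large language models in pediatric consultations, revealing the evolution of adversarial robustness and a scale paradox |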
large language model |
|
|
| 3 |
Bounded Hyperbolic Tangent: A Stable and Efficient Alternative to Pre-Layer Normalization in Large Language Models |
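Proposes Bounded Hyperbolic Tangent (BHyT) as a replacement for Pre-LN, improving the training stability and efficiency of large language models |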
large language model |
|
|
| 4 |
TimeBill: Time-Budgeted Inference for Large Language Models |
TimeBill: a time-budgeted inference framework for large language models that improves task completion rate and response performance |
large language model |
|
|
| 5 |
Knowledge Reasoning of Large Language Models Integrating Graph-Structured Information for Pest and Disease Control in Tobacco |
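Proposes a large language model that integrates graph-structured information for knowledge reasoning in tobacco pest and disease control |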
large language model |
|
|
| 6 |
Towards Efficient Post-Training via Fourier-Driven Adapter Architectures |
Proposes FAA, a Fourier-transform-based adapter architecture for efficient fine-tuning of large pretrained language models |
large language model |
|
|
| 7 |
CricBench: A Multilingual Benchmark for Evaluating LLMs in Cricket Analytics |
CricBench: a multilingual benchmark for evaluating LLM performance in cricket analytics |
large language model |
|
|
| 8 |
Bridging the Copyright Gap: Do Large Vision-Language Models Recognize and Respect Copyrighted Content? |
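Evaluates the copyright awareness of large vision-language models and proposes a tool-augmented defense framework to reduce infringement risk |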
multimodal |
|
|
| 9 |
Context as a Tool: Context Management for Long-Horizon SWE-Agents |
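Proposes the CAT framework, which manages context through callable tools to improve the performance of long-horizon software engineering agents |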
large language model |
|
|
| 10 |
Broken Words, Broken Performance: Effect of Tokenization on Performance of LLMs |
Shows that tokenization affects LLM performance and proposes a penalty function to quantify tokenization quality |
large language model |
|
|
| 11 |
Method Decoration (DeMe): A Framework for LLM-Driven Adaptive Method Generation in Dynamic IoT Environments |
Proposes the DeMe framework, which uses LLMs to drive adaptive method generation in IoT environments |
large language model |
|
|
| 12 |
On The Conceptualization and Societal Impact of Cross-Cultural Bias |
Analyzes the literature on cross-cultural bias and advocates assessing the societal impact of language technologies |
large language model |
|
|