| 1 |
AdaMMS: Model Merging for Heterogeneous Multimodal Large Language Models with Unsupervised Coefficient Optimization |
AdaMMS: model merging with unsupervised coefficient optimization for heterogeneous multimodal large language models |
large language model multimodal |
|
|
| 2 |
Model Hemorrhage and the Robustness Limits of Large Language Models |
Studies the "model hemorrhage" phenomenon in LLMs and proposes mitigation strategies to improve robustness at deployment. |
large language model |
|
|
| 3 |
Does "Reasoning" with Large Language Models Improve Recognizing, Generating, and Reframing Unhelpful Thoughts? |
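Uses LLM reasoning capabilities to improve the recognition, generation, and reframing of unhelpful thoughts in cognitive behavioral therapy |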
large language model |
|
|
| 4 |
Harnessing the Reasoning Economy: A Survey of Efficient Reasoning for Large Language Models |
Survey: exploring the economy of efficient reasoning in large language models |
large language model |
|
|
| 5 |
BEATS: Bias Evaluation and Assessment Test Suite for Large Language Models |
BEATS: a test suite for evaluating bias, ethics, fairness, and factuality in large language models |
large language model |
|
|
| 6 |
A Survey on Test-Time Scaling in Large Language Models: What, How, Where, and How Well? |
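A comprehensive survey of test-time scaling (TTS) in large language models, analyzing its principles, methods, applications, and effectiveness. |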
large language model |
✅ |
|
| 7 |
Evaluating the Feasibility and Accuracy of Large Language Models for Medical History-Taking in Obstetrics and Gynecology |
Evaluates the feasibility and accuracy of large language models for medical history-taking in obstetrics and gynecology |
large language model |
|
|
| 8 |
Integrating Large Language Models with Human Expertise for Disease Detection in Electronic Health Records |
Combines large language models with human expertise to improve the accuracy of disease detection in electronic health records |
large language model |
|
|
| 9 |
Do Large Language Models Exhibit Spontaneous Rational Deception? |
Large language models spontaneously engage in rational deception in certain situations |
large language model |
|
|
| 10 |
Text Chunking for Document Classification for Urban System Management using Large Language Models |
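Proposes a text-chunking LLM approach for classifying urban system management documents, with performance comparable to human performance. |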
large language model |
|
|
| 11 |
Enhancing Large Language Models (LLMs) for Telecommunications using Knowledge Graphs and Retrieval-Augmented Generation |
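Proposes a framework based on knowledge graphs and retrieval-augmented generation (KG-RAG) to improve large language model performance in the telecommunications domain. |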
large language model |
|
|
| 12 |
BeMERC: Behavior-Aware MLLM-based Framework for Multimodal Emotion Recognition in Conversation |
BeMERC: a behavior-aware MLLM-based framework for multimodal emotion recognition in conversation |
multimodal |
|
|
| 13 |
Building Instruction-Tuning Datasets from Human-Written Instructions with Open-Weight Large Language Models |
Builds high-quality instruction-tuning datasets from human-written instructions using open-weight large language models |
large language model |
|
|
| 14 |
Mapping Geopolitical Bias in 11 Large Language Models: A Bilingual, Dual-Framing Analysis of U.S.-China Tensions |
A bilingual, dual-framing analysis reveals geopolitical bias in 11 large language models on U.S.-China issues. |
large language model |
|
|
| 15 |
Large Language Models Pass the Turing Test |
GPT-4.5 passes the Turing test, confirming for the first time that large language models have human-level conversational ability |
large language model |
|
|
| 16 |
Texture or Semantics? Vision-Language Models Get Lost in Font Recognition |
Shows that vision-language models are easily distracted by texture in font recognition and lack sufficient semantic understanding |
multimodal chain-of-thought |
|
|
| 17 |
Contradiction Detection in RAG Systems: Evaluating LLMs as Context Validators for Improved Information Consistency |
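Proposes a contradiction detection framework for RAG systems and evaluates how well LLMs maintain information consistency as context validators |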
large language model chain-of-thought |
|
|
| 18 |
TeleAntiFraud-28k: An Audio-Text Slow-Thinking Dataset for Telecom Fraud Detection |
Introduces TeleAntiFraud-28k, an audio-text slow-thinking dataset for telecom fraud detection. |
large language model multimodal |
✅ |
|
| 19 |
Contextualize-then-Aggregate: Circuits for In-Context Learning in Gemma-2 2B |
Uses causal interventions to reveal the "contextualize-then-aggregate" mechanism behind in-context learning in Gemma-2 2B |
large language model |
|
|
| 20 |
A Systematic Evaluation of LLM Strategies for Mental Health Text Analysis: Fine-tuning vs. Prompt Engineering vs. RAG |
A systematic evaluation of LLM strategies for mental health text analysis: fine-tuning, prompt engineering, and RAG |
large language model |
|
|
| 21 |
Artificial Conversations, Real Results: Fostering Language Detection with Synthetic Data |
Using synthetic data to foster language detection: an LLM-based method for detecting inclusive language in Italian |
large language model |
|
|
| 22 |
SciReplicate-Bench: Benchmarking LLMs in Agent-driven Algorithmic Reproduction from Research Papers |
Introduces SciReplicate-Bench to evaluate LLMs on algorithmic reproduction |
large language model |
✅ |
|
| 23 |
Synthesizing Public Opinions with LLMs: Role Creation, Impacts, and the Future to eDemocracy |
Uses large language models to synthesize public opinion, addressing bias in traditional survey methods. |
large language model |
|
|
| 24 |
Dynamic Parametric Retrieval Augmented Generation for Test-time Knowledge Enhancement |
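Proposes the Dynamic Parametric Retrieval Augmented Generation (DyPRAG) framework to address the efficiency and generalization problems of test-time knowledge enhancement. |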
large language model |
✅ |
|
| 25 |
Adaptive Layer-skipping in Pre-trained LLMs |
FlexiDepth: an adaptive layer-skipping method for pre-trained LLMs that achieves significant speedup on Llama-3-8B. |
large language model |
|
|
| 26 |
LANID: LLM-assisted New Intent Discovery |
LANID: uses LLMs to assist a lightweight encoder in new intent discovery |
large language model |
✅ |
|
| 27 |
Entropy-Based Adaptive Weighting for Self-Training |
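Proposes EAST, an entropy-based adaptive weighting method for self-training, to improve the mathematical problem-solving ability of large language models. |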
large language model |
|
|
| 28 |
Did ChatGPT or Copilot use alter the style of internet news headlines? A time series regression analysis |
The study finds that the release of ChatGPT and Copilot had limited impact on the style of internet news headlines. |
large language model |
|
|