| 1 |
Talk Less, Interact Better: Evaluating In-context Conversational Adaptation in Multimodal LLMs |
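The ICCA framework evaluates whether multimodal LLMs spontaneously become more efficient communicators over the course of a conversation |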
large language model multimodal |
✅ |
|
| 2 |
MuChoMusic: Evaluating Music Understanding in Multimodal Audio-Language Models |
Proposes the MuChoMusic benchmark for evaluating music understanding in multimodal audio-language models |
multimodal |
|
|
| 3 |
The Impact of Hyperparameters on Large Language Model Inference Performance: An Evaluation of vLLM and HuggingFace Pipelines |
Studies how hyperparameters affect large language model inference performance, comparing vLLM and HuggingFace Pipelines. |
large language model |
|
|
| 4 |
Fairness in Large Language Models in Three Hours |
A systematic look at fairness problems in large language models and their solutions |
large language model |
✅ |
|
| 5 |
High-Throughput Phenotyping of Clinical Text Using Large Language Models |
Uses large language models to automate high-throughput phenotyping of clinical text |
large language model |
|
|
| 6 |
Coalitions of Large Language Models Increase the Robustness of AI Agents |
Proposes AI agents built on coalitions of large language models, improving robustness and lowering operating costs |
large language model |
|
|
| 7 |
The Mismeasure of Man and Models: Evaluating Allocational Harms in Large Language Models |
Proposes the RABBI metric for evaluating the potential bias-related harms of large language models in resource-allocation decisions. |
large language model |
|
|
| 8 |
Leveraging Encoder-only Large Language Models for Mobile App Review Feature Extraction |
Uses encoder-only large language models to improve feature extraction from mobile app reviews |
large language model |
|
|
| 9 |
Prompt Recursive Search: A Living Framework with Adaptive Growth in LLM Auto-Prompting |
Proposes the Prompt Recursive Search framework, which adaptively adjusts LLM prompting strategies to improve complex problem solving. |
large language model chain-of-thought |
|
|
| 10 |
Evaluating the Impact of Advanced LLM Techniques on AI-Lecture Tutors for a Robotics Course |
Evaluates how well LLM techniques work in AI tutoring for a robotics course |
large language model |
|
|
| 11 |
Toward Automatic Relevance Judgment using Vision--Language Models for Image--Text Retrieval Evaluation |
Uses vision-language models to improve relevance judgments for image-text retrieval |
large language model |
|
|
| 12 |
Deep Learning based Visually Rich Document Content Understanding: A Survey |
Survey: deep-learning methods for understanding the content of visually rich documents |
multimodal |
|
|
| 13 |
Misinforming LLMs: vulnerabilities, challenges and opportunities |
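Exposes the vulnerability of large language models to misinformation, along with the associated challenges and opportunities |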
large language model |
|
|
| 14 |
PERSOMA: PERsonalized SOft ProMpt Adapter Architecture for Personalized Language Prompting |
PERSOMA: a personalized soft-prompt adapter architecture that captures users' interaction history to improve personalized language prompting. |
large language model |
|
|
| 15 |
MoDE: Effective Multi-task Parameter Efficient Fine-Tuning with a Mixture of Dyadic Experts |
MoDE: a multi-task parameter-efficient fine-tuning method based on a mixture of dyadic experts |
large language model |
|
|
| 16 |
Prompt Refinement or Fine-tuning? Best Practices for using LLMs in Computational Social Science Tasks |
Studies best practices for LLM fine-tuning versus prompt engineering in computational social science tasks |
large language model |
|
|
| 17 |
FANNO: Augmenting High-Quality Instruction Data with Open-Sourced LLMs Only |
FANNO: augments high-quality instruction data using only open-source LLMs, with no human annotation or costly APIs. |
large language model |
|
|
| 18 |
RAGEval: Scenario Specific RAG Evaluation Dataset Generation Framework |
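RAGEval: proposes a framework for generating scenario-specific RAG evaluation datasets, addressing the difficulty of evaluating RAG systems in specific scenarios. |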
large language model |
✅ |
|
| 19 |
CFBench: A Comprehensive Constraints-Following Benchmark for LLMs |
Proposes CFBench, a comprehensive constraint-following benchmark for evaluating large language models. |
large language model |
✅ |
|
| 20 |
Task Prompt Vectors: Effective Initialization through Multi-Task Soft-Prompt Transfer |
Proposes task prompt vectors, which enable effective initialization through multi-task soft-prompt transfer. |
large language model |
|
|
| 21 |
IAI Group at CheckThat! 2024: Transformer Models and Data Augmentation for Checkworthy Claim Detection |
The IAI group applies Transformer models and data augmentation to check-worthy claim detection at CheckThat! 2024 |
chain-of-thought |
|
|
| 22 |
Bridging Information Gaps in Dialogues With Grounded Exchanges Using Knowledge Graphs |
Proposes the BridgeKG dataset, using large language models to bridge information gaps in dialogue systems with knowledge graphs |
large language model |
|
|
| 23 |
Automatic Extraction of Relationships among Motivations, Emotions and Actions from Natural Language Texts |
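Proposes a graph-based framework that automatically extracts relationships among motivations, emotions, and actions from natural-language text. |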
large language model |
|
|