| 1 |
Controlling Whisper: Universal Acoustic Adversarial Attacks to Control Speech Foundation Models |
Proposes universal acoustic adversarial attacks that control the behavior of speech foundation models
large language model foundation model |
|
|
| 2 |
ArAIEval Shared Task: Propagandistic Techniques Detection in Unimodal and Multimodal Arabic Content |
ArAIEval shared task: approaches for detecting propagandistic techniques in unimodal and multimodal Arabic content
multimodal |
|
|
| 3 |
Re-Tuning: Overcoming the Compositionality Limits of Large Language Models with Recursive Tuning |
Re-Tuning: overcoming the compositionality limits of large language models via recursive tuning
large language model |
|
|
| 4 |
Identifying the Source of Generation for Large Language Models |
Proposes a bigram-based source identifier for attributing text generated by large language models.
large language model |
|
|
| 5 |
Crafting Large Language Models for Enhanced Interpretability |
Proposes Concept Bottleneck Large Language Models (CB-LLM), achieving inherent interpretability while improving model performance.
large language model |
|
|
| 6 |
ANAH-v2: Scaling Analytical Hallucination Annotation of Large Language Models |
ANAH-v2: proposes an iterative self-training framework to scale up analytical hallucination annotation for large language models.
large language model |
|
|
| 7 |
VRSD: Rethinking Similarity and Diversity for Retrieval in Large Language Models |
VRSD: rethinking similarity and diversity for retrieval in large language models
large language model |
|
|
| 8 |
Generalists vs. Specialists: Evaluating Large Language Models for Urdu |
Compares generalist and specialist LLMs on Urdu NLP tasks, finding that specialist models perform better
large language model |
|
|
| 9 |
Not (yet) the whole story: Evaluating Visual Storytelling Requires More than Measuring Coherence, Grounding, and Repetition |
Proposes a human-likeness-based evaluation method for visual story generation, revealing the limitations of existing metrics.
foundation model visual grounding |
|
|
| 10 |
Me, Myself, and AI: The Situational Awareness Dataset (SAD) for LLMs |
Introduces the Situational Awareness Dataset (SAD) for evaluating the self-knowledge of large language models (LLMs).
large language model instruction following |
|
|
| 11 |
Automating Venture Capital: Founder assessment using LLM-powered segmentation, feature engineering and automated labeling techniques |
Uses LLMs for founder assessment to help automate venture capital decision-making
large language model chain-of-thought |
|
|
| 12 |
PoPreRo: A New Dataset for Popularity Prediction of Romanian Reddit Posts |
Introduces PoPreRo: a new dataset for predicting the popularity of Romanian Reddit posts
large language model |
✅ |
|
| 13 |
GPT vs RETRO: Exploring the Intersection of Retrieval and Parameter-Efficient Fine-Tuning |
Compares PEFT on GPT and RETRO models, revealing a synergy between retrieval augmentation and parameter-efficient fine-tuning
large language model |
|
|
| 14 |
Towards Enhancing Coherence in Extractive Summarization: Dataset and Experiments with LLMs |
Proposes a user-intent-oriented extractive summarization dataset to improve the coherence of LLM-generated summaries
large language model |
✅ |
|
| 15 |
Aligning Model Evaluations with Human Preferences: Mitigating Token Count Bias in Language Model Assessments |
Proposes a calibration method that mitigates token-count bias in language model evaluations, improving alignment with human preferences
large language model |
|
|
| 16 |
Statistical investigations into the geometry and homology of random programs |
Applies geometric and topological statistics to analyze random programs, without requiring high-dimensional embedding approximations
large language model |
|
|
| 17 |
From 'Showgirls' to 'Performers': Fine-tuning with Gender-inclusive Language for Bias Reduction in LLMs |
Fine-tunes LLMs with gender-inclusive language to reduce gender bias
large language model |
|
|