| 1 |
Protecting Privacy in Multimodal Large Language Models with MLLMU-Bench |
Proposes the MLLMU-Bench benchmark for evaluating and improving privacy protection in multimodal large language models. |
large language model multimodal |
|
|
| 2 |
Beyond Text: Optimizing RAG with Multimodal Inputs for Industrial Applications |
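Proposes a RAG optimization method based on multimodal inputs for industrial applications, improving question-answering performance. |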
large language model multimodal |
|
|
| 3 |
Enhancing Adversarial Attacks through Chain of Thought |
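Proposes a chain-of-thought-based GCG adversarial attack method that improves the transferability and universality of adversarial attacks on LLMs. |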
large language model chain-of-thought |
✅ |
|
| 4 |
Auto-Intent: Automated Intent Discovery and Self-Exploration for Large Language Model Web Agents |
Auto-Intent: a large language model web agent that automatically discovers intents and self-explores, without fine-tuning. |
large language model |
|
|
| 5 |
A Novel Psychometrics-Based Approach to Developing Professional Competency Benchmark for Large Language Models |
Proposes a psychometrics-based method for building professional competency benchmarks for LLMs, applied to the education domain. |
large language model |
|
|
| 6 |
Do Large Language Models Align with Core Mental Health Counseling Competencies? |
CounselingBench: evaluates how large language models perform on mental health counseling competencies. |
large language model |
✅ |
|
| 7 |
Anticipating Future with Large Language Model for Simultaneous Machine Translation |
Proposes TAF, which uses a large language model to predict upcoming words and improve simultaneous machine translation quality. |
large language model |
✅ |
|
| 8 |
MIMIC-IV-Ext-PE: Using a large language model to predict pulmonary embolism phenotype in the MIMIC-IV dataset |
Uses a large language model to predict pulmonary embolism phenotype in the MIMIC-IV dataset. |
large language model |
|
|
| 9 |
Multimodal Quantum Natural Language Processing: A Novel Framework for using Quantum Methods to Analyse Real Data |
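Proposes a multimodal quantum natural language processing framework that uses quantum methods to analyse linguistic compositionality in real data. |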
multimodal |
|
|
| 10 |
ProMQA: Question Answering Dataset for Multimodal Procedural Activity Understanding |
Proposes ProMQA, a multimodal question-answering dataset for evaluating procedural activity understanding. |
multimodal |
|
|
| 11 |
Improving Math Problem Solving in Large Language Models Through Categorization and Strategy Tailoring |
Proposes a categorization and strategy-tailoring approach that improves math problem solving in large language models. |
large language model |
|
|
| 12 |
Linear Chain Transformation: Expanding Optimization Dynamics for Fine-Tuning Large Language Models |
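LinChain: expands optimization dynamics through linear chain transformation to improve fine-tuning performance of large language models. |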
large language model |
|
|
| 13 |
Personalization of Large Language Models: A Survey |
Survey of research on large language model personalization, bridging the gap between text generation and downstream applications. |
large language model |
|
|
| 14 |
SG-Bench: Evaluating LLM Safety Generalization Across Diverse Tasks and Prompt Types |
SG-Bench: a comprehensive benchmark for evaluating LLM safety generalization across diverse tasks and prompt types. |
large language model chain-of-thought |
|
|
| 15 |
Attention Speaks Volumes: Localizing and Mitigating Bias in Language Models |
Proposes ATLAS, which mitigates bias in large language models by intervening on the attention mechanism. |
large language model |
|
|
| 16 |
Scaling LLM Inference with Optimized Sample Compute Allocation |
Proposes the OSCA algorithm, which significantly improves LLM inference efficiency by optimizing sample compute allocation. |
large language model |
✅ |
|
| 17 |
Toxicity of the Commons: Curating Open-Source Pre-Training Data |
Proposes an open-source data filtering pipeline to reduce harmful outputs. |
large language model |
|
|
| 18 |
DISCERN: Decoding Systematic Errors in Natural Language for Text Classifiers |
DISCERN: uses natural language explanations to decode systematic errors in text classifiers. |
large language model |
|
|
| 19 |
Benchmarking LLM Guardrails in Handling Multilingual Toxicity |
Builds a multilingual toxicity test benchmark to evaluate the effectiveness and robustness of LLM guardrails. |
large language model |
|
|
| 20 |
The Impact of Inference Acceleration on Bias of LLMs |
Inference acceleration optimizations can change LLM bias significantly and unpredictably. |
large language model |
|
|
| 21 |
Distinguishing Ignorance from Error in LLM Hallucinations |
Distinguishes ignorance from error in LLM hallucinations, improving hallucination detection and mitigation. |
large language model |
✅ |
|
| 22 |
SceneGenAgent: Precise Industrial Scene Generation with Coding Agent |
SceneGenAgent: generates precise industrial scenes through code, addressing the difficulty of applying LLMs to industrial scene generation. |
large language model |
✅ |
|
| 23 |
Self-Preference Bias in LLM-as-a-Judge |
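Proposes a quantitative metric for self-preference bias when an LLM acts as a judge, and shows that the bias stems from a preference for low-perplexity text. |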
large language model |
|
|
| 24 |
Leveraging LLMs for Hypothetical Deduction in Logical Inference: A Neuro-Symbolic Approach |
LINA: a neuro-symbolic logical reasoning method that uses LLMs for hypothetical deduction. |
large language model |
✅ |
|
| 25 |
Learning and Unlearning of Fabricated Knowledge in Language Models |
Studies how language models learn and unlearn fabricated knowledge, and proposes multi-step sparse updates to mitigate data poisoning. |
large language model |
|
|
| 26 |
A Bayesian Approach to Harnessing the Power of LLMs in Authorship Attribution |
Proposes a Bayesian framework for LLM-based authorship attribution that achieves excellent one-shot classification accuracy. |
large language model |
|
|