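| 1 | MIRAGE: Multimodal Immersive Reasoning and Guided Exploration for Red-Team Jailbreak Attacks | Proposes the MIRAGE framework, which uses multimodal immersive reasoning and guided exploration to carry out red-team jailbreak attacks on MLLMs. | large language model, multimodal | |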
|
| 2 | A Survey of Large Language Model Agents for Question Answering | Survey of question-answering systems built on large language model agents. | large language model | |
|
| 3 | I Have Covered All the Bases Here: Interpreting Reasoning Features in Large Language Models via Sparse Autoencoders | Uses sparse autoencoders to interpret reasoning features in large language models, revealing their internal reasoning mechanisms. | large language model | ✅ |
|
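| 4 | Commander-GPT: Fully Unleashing the Sarcasm Detection Capability of Multi-Modal Large Language Models | Proposes the Commander-GPT framework, which significantly improves the sarcasm detection performance of multimodal large language models without fine-tuning. | large language model | |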
|
| 5 | Self-Reported Confidence of Large Language Models in Gastroenterology: Analysis of Commercial, Open-Source, and Quantized Models | Evaluates the self-reported confidence and accuracy of large language models on gastroenterology questions. | large language model | |
|
| 6 | MAGIC-VQA: Multimodal And Grounded Inference with Commonsense Knowledge for Visual Question Answering | MAGIC-VQA: a multimodal visual question answering framework that incorporates commonsense knowledge. | multimodal | |
|
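| 7 | J&H: Evaluating the Robustness of Large Language Models Under Knowledge-Injection Attacks in Legal Domain | Proposes the J&H framework to evaluate the robustness of large language models under knowledge-injection attacks in the legal domain. | large language model | |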
|
| 8 | Surgical Action Planning with Large Language Models | Proposes the LLM-SAP framework, which uses large language models for action planning in robot-assisted surgery. | large language model | |
|
| 9 | TIB-STC: A Large-Scale Structured Tibetan Benchmark for Low-Resource Language Modeling | Builds TIB-STC, a large-scale structured Tibetan benchmark dataset, to advance low-resource language modeling. | large language model, instruction following | ✅ |
|
| 10 | Overtrained Language Models Are Harder to Fine-Tune | Reveals the "catastrophic overtraining" phenomenon, in which overtraining large language models makes them harder to fine-tune. | large language model | |
|
| 11 | Evaluating Bias in LLMs for Job-Resume Matching: Gender, Race, and Education | Evaluates bias in LLMs for job-resume matching across gender, race, and education. | large language model | |
|
| 12 | Language Model Uncertainty Quantification with Attention Chain | Proposes UQAC, a method that uses attention chains to efficiently quantify LLM uncertainty in complex reasoning. | large language model | |
|
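| 13 | Understanding and Improving Information Preservation in Prompt Compression for LLMs | Proposes an evaluation framework for prompt compression and improves soft-prompt methods, increasing information preservation and downstream task performance for LLMs. | large language model | ✅ |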
|
| 14 | Masks and Mimicry: Strategic Obfuscation and Impersonation Attacks on Authorship Verification | Uses LLM-based obfuscation and impersonation attacks to evaluate the adversarial robustness of authorship verification models. | large language model | |
|
| 15 | LLM-Based Insight Extraction for Contact Center Analytics and Cost-Efficient Deployment | Proposes an LLM-based insight extraction system for contact centers, enabling low-cost call analytics and intelligent applications. | large language model | |
|
| 16 | LookAhead Tuning: Safer Language Models via Partial Answer Previews | LookAhead Tuning: improves the safety of fine-tuned language models by previewing partial answer prefixes. | large language model | |
|
| 17 | Exploring Training and Inference Scaling Laws in Generative Retrieval | Explores training and inference scaling laws in generative retrieval, showing how model size, data size, and compute affect performance. | large language model | |
|
| 18 | xKV: Cross-Layer SVD for KV-Cache Compression | xKV: compresses the KV-cache via cross-layer singular value decomposition, improving long-context LLM inference efficiency. | large language model | ✅ |
|
| 19 | AgentDropout: Dynamic Agent Elimination for Token-Efficient and High-Performance LLM-Based Multi-Agent Collaboration | Proposes AgentDropout to address low communication efficiency in multi-agent systems. | large language model | ✅ |
|
| 20 | LANGALIGN: Enhancing Non-English Language Models via Cross-Lingual Embedding Alignment | LANGALIGN: enhances non-English language models through cross-lingual embedding alignment. | large language model | |
|
| 21 | Autoregressive Language Models for Knowledge Base Population: A case study in the space mission domain | Proposes a knowledge base population method based on autoregressive language models, applied to the space mission domain. | large language model | |
|
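| 22 | Fact-checking AI-generated news reports: Can LLMs catch their own lies? | Evaluates how well large language models can judge the veracity of news reports they generated themselves, revealing their limitations. | large language model | |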
|
| 23 | Enhancing Multi-Label Emotion Analysis and Corresponding Intensities for Ethiopian Languages | Enhances multi-label emotion analysis for Ethiopian languages by introducing emotion intensity modeling. | large language model | |
|