| 1 |
Chain-of-Thought Reasoning Improves Context-Aware Translation with Large Language Models |
Chain-of-thought reasoning improves the context-aware translation ability of large language models |
large language model chain-of-thought |
|
|
| 2 |
BenCao: An Instruction-Tuned Large Language Model for Traditional Chinese Medicine |
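BenCao: an instruction-tuned large language model for Traditional Chinese Medicine, with multimodal fusion and alignment to expert knowledge |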
large language model multimodal chain-of-thought |
|
|
| 3 |
StreamingThinker: Large Language Models Can Think While Reading |
Proposes StreamingThinker, which lets LLMs reason while reading, reducing latency and improving performance in dynamic scenarios |
large language model chain-of-thought |
✅ |
|
| 4 |
From Local to Global: Revisiting Structured Pruning Paradigms for Large Language Models |
Proposes GISP, a global iterative structured pruning method that improves downstream task performance of large language models |
large language model |
|
|
| 5 |
SimBench: Benchmarking the Ability of Large Language Models to Simulate Human Behaviors |
SimBench: builds a large-scale benchmark to evaluate how well large language models simulate human behavior |
large language model |
|
|
| 6 |
Select-Then-Decompose: From Empirical Analysis to Adaptive Selection Strategy for Task Decomposition in Large Language Models |
Proposes the Select-Then-Decompose strategy, which adaptively selects a task decomposition method to optimize LLM performance and cost |
large language model |
✅ |
|
| 7 |
CMT-Bench: Cricket Multi-Table Generation Benchmark for Probing Robustness in Large Language Models |
CMT-Bench: a cricket multi-table generation benchmark for evaluating the robustness of large language models |
large language model |
|
|
| 8 |
Qomhra: A Bilingual Irish and English Large Language Model |
Qomhrá: a bilingual Irish-English large language model that addresses the problem of building LLMs for low-resource languages |
large language model |
|
|
| 9 |
Forget to Know, Remember to Use: Context-Aware Unlearning for Large Language Models |
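Proposes a context-aware unlearning method that improves the usability of large language models after they forget specific knowledge |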
large language model |
|
|
| 10 |
Evaluating Large Language Models on Urdu Idiom Translation |
Builds an Urdu idiom translation dataset to evaluate large language models on a low-resource language |
large language model |
|
|
| 11 |
Explainability of Large Language Models: Opportunities and Challenges toward Generating Trustworthy Explanations |
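Surveys the explainability of Transformer-based large language models and discusses applications and challenges in healthcare and autonomous driving |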
large language model |
|
|
| 12 |
VERA-V: Variational Inference Framework for Jailbreaking Vision-Language Models |
VERA-V: a variational-inference-based framework for jailbreaking the defense mechanisms of vision-language models |
large language model multimodal |
|
|
| 13 |
Deep Self-Evolving Reasoning |
Proposes Deep Self-Evolving Reasoning (DSER) to improve the performance of small open-weight models on complex reasoning tasks |
large language model chain-of-thought |
|
|
| 14 |
The Atomic Instruction Gap: Instruction-Tuned LLMs Struggle with Simple, Self-Contained Directives |
Reveals an atomic instruction gap in instruction-tuned large models when they follow simple instructions |
large language model instruction following |
|
|
| 15 |
What Makes AI Research Replicable? Executable Knowledge Graphs as Scientific Knowledge Representations |
Proposes Executable Knowledge Graphs (xKG) to improve the reproducibility of AI research |
large language model |
✅ |
|
| 16 |
AcademicEval: Live Long-Context LLM Benchmark |
Proposes AcademicEval, a live benchmark for long-context LLMs that addresses the limitations of existing benchmarks |
large language model |
✅ |
|
| 17 |
Towards Mixed-Modal Retrieval for Universal Retrieval-Augmented Generation |
Proposes Nyx for mixed-modal retrieval in universal retrieval-augmented generation, improving performance on vision-language tasks |
large language model |
|
|
| 18 |
Automatic Prompt Generation via Adaptive Selection of Prompting Techniques |
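Proposes a method that adaptively selects prompting techniques to automatically generate high-quality prompts and improve LLM performance |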
large language model |
|
|
| 19 |
Language Models as Semantic Augmenters for Sequential Recommenders |
Proposes the LaMAR framework, which uses LLMs for semantic augmentation to improve sequential recommendation models |
large language model |
|
|
| 20 |
Is Multilingual LLM Watermarking Truly Multilingual? A Simple Back-Translation Solution |
Proposes STEAM, a back-translation-based method that improves the robustness of multilingual LLM watermarking in low-resource languages |
large language model |
|
|
| 21 |
Believe It or Not: How Deeply do LLMs Believe Implanted Facts? |
Proposes a belief-depth evaluation framework to verify the effectiveness of knowledge editing techniques |
large language model |
|
|
| 22 |
AtlasKV: Augmenting LLMs with Billion-Scale Knowledge Graphs in 20GB VRAM |
AtlasKV: augments large language models with billion-scale knowledge graphs using 20GB of VRAM |
large language model |
|
|
| 23 |
AFRICAPTION: Establishing a New Paradigm for Image Captioning in African Languages |
AfriCaption: proposes a new framework for image captioning in African languages, promoting the equitable development of multimodal AI |
multimodal |
|
|
| 24 |
EduAdapt: A Question Answer Benchmark Dataset for Evaluating Grade-Level Adaptability in LLMs |
EduAdapt: builds a question-answering benchmark dataset for evaluating the grade-level adaptability of LLMs |
large language model |
✅ |
|
| 25 |
Efficient Toxicity Detection in Gaming Chats: A Comparative Study of Embeddings, Fine-Tuned Transformers and LLMs |
Compares embeddings, fine-tuned Transformers, and LLMs for efficient detection of toxic speech in gaming chats |
large language model |
|
|
| 26 |
Does Reasoning Help LLM Agents Play Dungeons and Dragons? A Prompt Engineering Experiment |
Uses LLM reasoning to generate Dungeons and Dragons game commands: a prompt engineering experiment |
large language model |
|
|
| 27 |
LexChain: Modeling Legal Reasoning Chains for Chinese Tort Case Analysis |
Proposes the LexChain framework, which explicitly models legal reasoning chains in Chinese tort case analysis |
large language model |
|
|
| 28 |
Annotation-Efficient Universal Honesty Alignment |
Proposes the EliCal framework for efficient honesty alignment |
large language model |
|
|
| 29 |
Empowering Real-World: A Survey on the Technology, Practice, and Evaluation of LLM-driven Industry Agents |
Surveys the technology, practice, and evaluation of LLM-driven industry agents, with a view to enabling real-world applications |
large language model |
|
|
| 30 |
ReXMoE: Reusing Experts with Minimal Overhead in Mixture-of-Experts |
ReXMoE: improves mixture-of-experts model performance with minimal overhead by reusing experts |
large language model |
|
|
| 31 |
Disparities in Multilingual LLM-Based Healthcare Q&A |
Reveals cross-language disparities in multilingual LLM-based healthcare Q&A and proposes mitigation strategies |
large language model |
|
|
| 32 |
Wisdom is Knowing What not to Say: Hallucination-Free LLMs Unlearning via Attention Shifting |
Proposes the Attention-Shifting framework for hallucination-free targeted unlearning in LLMs used in knowledge-intensive applications |
large language model |
|
|
| 33 |
Verification-Aware Planning for Multi-Agent Systems |
VeriMAP: a verification-aware planning framework for multi-agent systems that improves the reliability of collaboration |
large language model |
|
|
| 34 |
JT-Safe: Intrinsically Enhancing the Safety and Trustworthiness of LLMs |
JT-Safe: improves the safety and trustworthiness of LLMs by enhancing world knowledge in the pretraining data |
large language model |
|
|