| # | Title | Summary | Tags | Selected |
|---|-------|---------|------|----------|
| 1 | Empowering Persian LLMs for Instruction Following: A Novel Dataset and Training Approach | Introduces FarsInstruct, a Persian instruction dataset, and the Co-CoLA training framework to improve the instruction-following ability of Persian LLMs. | large language model, instruction following | |
| 2 | Bridging Sequence-Structure Alignment in RNA Foundation Models | OmniGenome: an RNA foundation model that aligns sequence and structure via structure-contextualized modeling, enabling bidirectional mapping between RNA sequences and structures. | foundation model | |
| 3 | Q-Sparse: All Large Language Models can be Fully Sparsely-Activated | Q-Sparse: achieves fully sparse activation in large language models, improving inference efficiency. | large language model | |
| 4 | MMM: Multilingual Mutual Reinforcement Effect Mix Datasets & Test with Open-domain Information Extraction Large Language Models | Introduces MMM, multilingual mutual-reinforcement-effect mix datasets, used to train OIELLM, a large language model for open-domain information extraction. | large language model | ✅ |
| 5 | An Actionable Framework for Assessing Bias and Fairness in Large Language Model Use Cases | Proposes LangFair, a framework for assessing LLM bias and fairness tailored to specific use cases. | large language model | |
| 6 | Foundational Autoraters: Taming Large Language Models for Better Automatic Evaluation | Introduces FLAMe: improves automatic evaluation by training foundational autorater LLMs. | large language model | |
| 7 | Think-on-Graph 2.0: Deep and Faithful Large Language Model Reasoning with Knowledge-guided Retrieval Augmented Generation | Proposes Think-on-Graph 2.0, enabling deep LLM reasoning through knowledge-graph-guided retrieval-augmented generation. | large language model | ✅ |
| 8 | Graphusion: Leveraging Large Language Models for Scientific Knowledge Graph Fusion and Construction in NLP Education | Graphusion: uses large language models to fuse and construct scientific knowledge graphs, applied to NLP education. | large language model | |
| 9 | Evaluating Large Language Models with fmeval | fmeval: an open-source library for evaluating large language models on performance and responsible-AI dimensions. | large language model | ✅ |
| 10 | Prompt Selection Matters: Enhancing Text Annotations for Social Sciences with Large Language Models | Proposes an LLM annotation method based on automatic prompt optimization for social-science text annotation, significantly improving annotation accuracy. | large language model | |
| 11 | MetaTool: Facilitating Large Language Models to Master Tools with Meta-task Augmentation | MetaTool: improves the tool-use ability of large language models through meta-task augmentation. | large language model | |
| 12 | TCM-FTP: Fine-Tuning Large Language Models for Herbal Prescription Prediction | TCM-FTP: fine-tunes large language models to predict traditional Chinese medicine herbal prescriptions. | large language model | |
| 13 | Multilingual Contrastive Decoding via Language-Agnostic Layers Skipping | Proposes a multilingual contrastive decoding method based on language-agnostic layer skipping, improving LLM performance on multilingual reasoning tasks. | large language model, chain-of-thought | ✅ |
| 14 | Qwen2 Technical Report | Releases the Qwen2 series: open-source language models from 0.5B to 72B parameters, outperforming existing open-source models. | large language model, multimodal | |
| 15 | GraphEval: A Knowledge-Graph Based LLM Hallucination Evaluation Framework | GraphEval: a knowledge-graph-based framework for evaluating LLM hallucination. | large language model | |
| 16 | Codebook LLMs: Evaluating LLMs as Measurement Tools for Political Science Concepts | Proposes the Codebook-LLM framework, evaluating LLMs as measurement tools for political-science concepts and offering guidance for improvement. | large language model | |
| 17 | CLAVE: An Adaptive Framework for Evaluating Values of LLM Generated Responses | CLAVE: an adaptive framework for evaluating the values expressed in LLM-generated responses. | large language model | |
| 18 | DOCBENCH: A Benchmark for Evaluating LLM-based Document Reading Systems | Introduces DocBench: a benchmark for evaluating LLM-based document reading systems. | large language model | |
| 19 | Enhancing Retrieval and Managing Retrieval: A Four-Module Synergy for Improved Quality and Efficiency in RAG Systems | Proposes the ERM4 framework, optimizing the quality and efficiency of RAG systems through the synergy of four modules. | large language model | ✅ |
| 20 | An Empirical Study of Validating Synthetic Data for Formula Generation | Improves formula-generation model performance by validating synthetic training data. | large language model | |
| 21 | Beyond Generative Artificial Intelligence: Roadmap for Natural Language Generation | A roadmap for natural language generation, addressing new challenges in the era of large language models. | large language model | |
| 22 | How and where does CLIP process negation? | Analyzes how CLIP processes negation, shedding light on the internal mechanisms of multimodal models. | multimodal | |
| 23 | The Good, The Bad, and The Greedy: Evaluation of LLMs Should Not Ignore Non-Determinism | Highlights LLM non-determinism: reveals the performance gap between greedy decoding and sampling strategies and its impact on evaluation. | large language model | |