| 1 |
MiCEval: Unveiling Multimodal Chain of Thought's Quality via Image Description and Reasoning Steps |
Proposes MiCEval, which evaluates the quality of multimodal CoT through image descriptions and reasoning steps. |
large language model multimodal chain-of-thought |
✅ |
|
| 2 |
Few-Shot Joint Multimodal Entity-Relation Extraction via Knowledge-Enhanced Cross-modal Prompt Model |
Proposes a knowledge-enhanced cross-modal prompt model for few-shot joint multimodal entity-relation extraction. |
large language model multimodal |
|
|
| 3 |
Supervised Chain of Thought |
Proposes Supervised Chain of Thought (Supervised CoT) to improve LLM performance on complex reasoning tasks. |
large language model chain-of-thought |
|
|
| 4 |
SPRIG: Improving Large Language Model Performance by System Prompt Optimization |
SPRIG: improves large language model performance through system prompt optimization. |
large language model |
|
|
| 5 |
Large Language Models Are Overparameterized Text Encoders |
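Uses layer pruning to substantially reduce parameter redundancy when large language models serve as text encoders, improving inference efficiency. |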
large language model |
|
|
| 6 |
DFlow: Diverse Dialogue Flow Simulation with Large Language Models |
DFlow: uses large language models to simulate diverse dialogue flows, improving the quality of task-oriented dialogue data. |
large language model |
|
|
| 7 |
REEF: Representation Encoding Fingerprints for Large Language Models |
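Proposes REEF, a training-free representation encoding fingerprint method for large language models, aimed at intellectual property protection. |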
large language model |
✅ |
|
| 8 |
A Lightweight Multi Aspect Controlled Text Generation Solution For Large Language Models |
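Proposes a lightweight data augmentation scheme that improves large language model performance on multi-aspect controlled text generation. |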
large language model |
|
|
| 9 |
SylloBio-NLI: Evaluating Large Language Models on Biomedical Syllogistic Reasoning |
Proposes the SylloBio-NLI framework to evaluate large language models on biomedical syllogistic reasoning. |
large language model |
|
|
| 10 |
Paths-over-Graph: Knowledge Graph Empowered Large Language Model Reasoning |
Proposes Paths-over-Graph (PoG), which strengthens LLM reasoning with knowledge graph paths to address hallucination and insufficient knowledge. |
large language model |
|
|
| 11 |
Automated Genre-Aware Article Scoring and Feedback Using Large Language Models |
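Proposes an automated article scoring and feedback system based on BERT and ChatGPT, improving the accuracy and personalization of article quality assessment. |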
large language model |
|
|
| 12 |
Towards Faithful Natural Language Explanations: A Study Using Activation Patching in Large Language Models |
Proposes a causal faithfulness metric to improve the trustworthiness of natural language explanations. |
large language model |
✅ |
|
| 13 |
CAPE: A Chinese Dataset for Appraisal-based Emotional Generation using Large Language Models |
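CAPE: a Chinese emotional generation dataset based on cognitive appraisal theory, using large language models to improve emotional expression in dialogue. |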
large language model |
|
|
| 14 |
MultiChartQA: Benchmarking Vision-Language Models on Multi-Chart Problems |
Proposes the MultiChartQA benchmark to evaluate vision-language models on multi-chart reasoning. |
large language model multimodal |
✅ |
|
| 15 |
MoDification: Mixture of Depths Made Easy |
MoDification: a simple mixture-of-depths method that improves the efficiency of long-context LLMs. |
large language model |
|
|
| 16 |
LabSafety Bench: Benchmarking LLMs on Safety Issues in Scientific Labs |
Proposes LabSafety Bench to evaluate LLM performance on safety issues in scientific labs. |
large language model |
|
|
| 17 |
SudoLM: Learning Access Control of Parametric Knowledge with Authorization Alignment |
Proposes SudoLM, which implements access control over LLM parametric knowledge through authorization alignment. |
large language model |
|
|
| 18 |
Distance between Relevant Information Pieces Causes Bias in Long-Context LLMs |
LongPiBench: reveals the bias in long-context LLMs caused by the distance between relevant pieces of information. |
large language model |
|
|
| 19 |
Teaching Models to Balance Resisting and Accepting Persuasion |
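Proposes Persuasion-Balanced Training (PBT), which improves LLM resistance to adversarial persuasion while preserving the ability to accept beneficial persuasion. |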
large language model |
|
|
| 20 |
Tell me what I need to know: Exploring LLM-based (Personalized) Abstractive Multi-Source Meeting Summarization |
Proposes a three-stage LLM-based method for personalized multi-source meeting summarization. |
large language model |
|
|
| 21 |
Rationale Behind Essay Scores: Enhancing S-LLM's Multi-Trait Essay Scoring with Rationale Generated by LLMs |
Proposes RMTS, a rationale-based multi-trait essay scoring method that uses LLMs to improve S-LLM scoring performance. |
large language model |
✅ |
|
| 22 |
Are AI Detectors Good Enough? A Survey on Quality of Datasets With Machine-Generated Texts |
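Proposes a systematic method for assessing the quality of AI-generated text detection datasets, with the goal of improving detector generalization. |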
large language model |
✅ |
|
| 23 |
Combining Entropy and Matrix Nuclear Norm for Enhanced Evaluation of Language Models |
Combines entropy and the matrix nuclear norm into an enhanced evaluation method for language models. |
large language model |
|
|
| 24 |
A Systematic Study of Cross-Layer KV Sharing for Efficient LLM Inference |
Systematically studies cross-layer KV sharing to improve LLM inference efficiency. |
large language model |
|
|
| 25 |
How Do Multilingual Language Models Remember Facts? |
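Reveals how multilingual LLMs recall facts, analyzing which mechanisms are language-dependent and which are language-independent. |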
large language model |
|
|
| 26 |
Critical Questions Generation: Motivation and Challenges |
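Proposes the task of critical question generation, in which LLMs question arguments, as a way to mitigate their outdated-knowledge and hallucination problems. |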
large language model |
|
|
| 27 |
EcomEdit: An Automated E-commerce Knowledge Editing Framework for Enhanced Product and Purchase Intention Understanding |
Proposes ECOMEDIT, an e-commerce knowledge editing framework that improves LLM understanding of products and purchase intentions. |
large language model |
|
|
| 28 |
Good Parenting is all you need -- Multi-agentic LLM Hallucination Mitigation |
Multiple LLM agents collaborate to mitigate hallucination through "good parent"-style feedback. |
large language model |
|
|
| 29 |
Beyond Binary: Towards Fine-Grained LLM-Generated Text Detection via Role Recognition and Involvement Measurement |
Proposes the LLMDetect benchmark for fine-grained detection of LLM-generated text via role recognition and involvement measurement. |
large language model |
|
|
| 30 |
Towards Robust Knowledge Representations in Multilingual LLMs for Equivalence and Inheritance based Consistent Reasoning |
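Proposes a compositional representation method that improves the consistency of multilingual LLMs on equivalence- and inheritance-based reasoning tasks. |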
large language model |
|
|
| 31 |
XForecast: Evaluating Natural Language Explanations for Time Series Forecasting |
XForecast proposes simulatability-based metrics to evaluate the quality of natural language explanations for time series forecasting. |
large language model |
|
|
| 32 |
SRAP-Agent: Simulating and Optimizing Scarce Resource Allocation Policy with LLM-based Agent |
SRAP-Agent: uses LLMs to simulate and optimize scarce resource allocation policies. |
large language model |
✅ |
|