| 1 |
ChatEXAONEPath: An Expert-level Multimodal Large Language Model for Histopathology Using Whole Slide Images |
ChatEXAONEPath: an expert-level multimodal large language model for histopathology using whole slide images (WSIs) |
large language model multimodal |
|
|
| 2 |
Enhancing the Geometric Problem-Solving Ability of Multimodal LLMs via Symbolic-Neural Integration |
Proposes GeoGen and GeoLogic to improve the geometric problem-solving ability of multimodal LLMs |
large language model multimodal |
✅ |
|
| 3 |
GeoSense: Evaluating Identification and Application of Geometric Principles in Multimodal Reasoning |
GeoSense: a new benchmark for evaluating the geometric reasoning ability of multimodal LLMs |
large language model multimodal |
|
|
| 4 |
Syntactic and Semantic Control of Large Language Models via Sequential Monte Carlo |
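Proposes a sequential Monte Carlo framework for controlling language models, improving text generation under syntactic and semantic constraints |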
large language model |
|
|
| 5 |
Are Retrials All You Need? Enhancing Large Language Model Reasoning Without Verbalized Feedback |
Proposes a retry mechanism that requires no feedback, improving LLM reasoning while reducing computational cost |
large language model |
|
|
| 6 |
DIDS: Domain Impact-aware Data Sampling for Large Language Model Training |
DIDS: a domain impact-aware data sampling method for improving large language model training |
large language model |
✅ |
|
| 7 |
Identifying and Mitigating the Influence of the Prior Distribution in Large Language Models |
Improves LLM performance on deterministic tasks by intervening on the prior distribution |
large language model |
|
|
| 8 |
CPG-EVAL: A Multi-Tiered Benchmark for Evaluating the Chinese Pedagogical Grammar Competence of Large Language Models |
CPG-EVAL: a benchmark for evaluating the Chinese pedagogical grammar competence of large language models |
large language model |
|
|
| 9 |
SHA256 at SemEval-2025 Task 4: Selective Amnesia -- Constrained Unlearning for Large Language Models via Knowledge Isolation |
Proposes a selective unlearning method based on causal analysis and layer-wise optimization, improving data-privacy protection in LLMs |
large language model |
|
|
| 10 |
Benchmarking Multi-National Value Alignment for Large Language Models |
Proposes the NaVAB benchmark for evaluating how well large language models align with the values of multiple nations |
large language model |
|
|
| 11 |
Information Gain-Guided Causal Intervention for Autonomous Debiasing Large Language Models |
Proposes an information gain-guided causal intervention framework for autonomously debiasing large language models |
large language model |
|
|
| 12 |
Pandora: A Code-Driven Large Language Model Agent for Unified Reasoning Across Diverse Structured Knowledge |
Pandora: a code-driven large language model agent for unified reasoning across diverse structured knowledge |
large language model |
|
|
| 13 |
How Large Language Models Are Changing MOOC Essay Answers: A Comparison of Pre- and Post-LLM Responses |
Analyzes how LLMs affect MOOC essay answers by comparing AI ethics essays that students submitted before and after the release of ChatGPT |
large language model |
|
|
| 14 |
ConExion: Concept Extraction with Large Language Models |
ConExion: concept extraction with large language models, improving domain coverage evaluation and ontology learning |
large language model |
✅ |
|
| 15 |
Chinese-Vicuna: A Chinese Instruction-following Llama-based Model |
Chinese-Vicuna: a LLaMA-based Chinese instruction-following model aimed at low-resource settings |
instruction following |
|
|
| 16 |
Persona-judge: Personalized Alignment of Large Language Models via Token-level Self-judgment |
Proposes Persona-judge, which achieves personalized alignment of large language models through token-level self-judgment |
large language model |
|
|
| 17 |
Deep literature reviews: an application of fine-tuned language models to migration research |
Proposes a deep literature review framework based on fine-tuned language models and applies it to migration research |
large language model |
|
|
| 18 |
Retrieval-Augmented Generation with Conflicting Evidence |
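Proposes MADAM-RAG to handle the ambiguity, noise, and misinformation that conflicting evidence introduces into retrieval-augmented generation |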
large language model |
|
|
| 19 |
Accuracy is Not Agreement: Expert-Aligned Evaluation of Crash Narrative Classification Models |
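Reveals the paradox between accuracy and expert agreement in crash narrative classification and explores the potential of LLMs in safety-critical tasks |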
large language model |
|
|
| 20 |
Aspect-Based Summarization with Self-Aspect Retrieval Enhanced Generation |
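Proposes a self-aspect retrieval enhanced generation framework for aspect-based summarization, addressing the token limits and hallucination of large models |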
large language model |
|
|
| 21 |
Accommodate Knowledge Conflicts in Retrieval-augmented LLMs: Towards Robust Response Generation in the Wild |
Proposes the Swin-VIB framework to address the response uncertainty caused by knowledge conflicts in retrieval-augmented LLMs |
large language model |
|
|
| 22 |
ImPart: Importance-Aware Delta-Sparsification for Improved Model Compression and Merging in LLMs |
ImPart: an importance-aware delta-sparsification method for LLMs that improves model compression and merging |
large language model |
|
|
| 23 |
Sparks of Science: Hypothesis Generation Using Structured Paper Data |
Proposes the HypoGen dataset for training models to generate more novel and feasible scientific hypotheses |
foundation model |
|
|
| 24 |
MAIN: Mutual Alignment Is Necessary for instruction tuning |
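Proposes the MAIN framework, which uses mutual alignment to improve the quality of instruction-response pairs in instruction tuning |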
large language model |
|
|
| 25 |
ViClaim: A Multilingual Multilabel Dataset for Automatic Claim Detection in Videos |
ViClaim: a multilingual, multilabel dataset for automatic claim detection in videos |
multimodal |
|
|
| 26 |
Can LLMs reason over extended multilingual contexts? Towards long-context evaluation beyond retrieval and haystacks |
Proposes MLRBench for evaluating LLM reasoning over long multilingual contexts, going beyond simple retrieval |
large language model |
|
|
| 27 |
Assessing LLMs in Art Contexts: Critique Generation and Theory of Mind Evaluation |
Evaluates LLMs in art contexts: critique generation and theory of mind assessment |
large language model |
|
|
| 28 |
Data-efficient LLM Fine-tuning for Code Generation |
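Proposes a data selection strategy and a dynamic token packing strategy that improve the efficiency and performance of LLM fine-tuning for code generation |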
large language model |
|
|
| 29 |
GRAIL: Gradient-Based Adaptive Unlearning for Privacy and Copyright in LLMs |
GRAIL: a gradient-based adaptive unlearning framework for privacy and copyright protection in LLMs |
large language model |
|
|
| 30 |
Scaling Instruction-Tuned LLMs to Million-Token Contexts via Hierarchical Synthetic Data Generation |
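Proposes a long-context extension method for large language models based on hierarchical synthetic data generation, enabling million-token context processing |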
large language model |
|
|
| 31 |
SOLAR: Towards Characterizing Subjectivity of Individuals through Modeling Value Conflicts and Trade-offs |
SOLAR: a framework that characterizes individual subjectivity by modeling value conflicts and trade-offs |
large language model |
|
|
| 32 |
CDF-RAG: Causal Dynamic Feedback for Adaptive Retrieval-Augmented Generation |
Proposes CDF-RAG, which uses causal dynamic feedback to strengthen the reasoning and factual accuracy of RAG |
large language model |
|
|
| 33 |
ELAB: Extensive LLM Alignment Benchmark in Persian Language |
ELAB: an extensive alignment benchmark for large language models in Persian |
large language model |
✅ |
|