| 1 |
ImF: Implicit Fingerprint for Large Language Models |
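Proposes ImF, an implicit fingerprint that strengthens intellectual property protection for large language models and resists adversarial attacks. |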
large language model chain-of-thought |
|
|
| 2 |
FLEX: A Benchmark for Evaluating Robustness of Fairness in Large Language Models |
FLEX: a benchmark for evaluating the robustness of fairness in large language models |
large language model |
|
|
| 3 |
DeCAP: Context-Adaptive Prompt Generation for Debiasing Zero-shot Question Answering in Large Language Models |
Proposes DeCAP, which uses context-adaptive prompt generation to debias zero-shot question answering in large language models |
large language model |
|
|
| 4 |
Bigger But Not Better: Small Neural Language Models Outperform Large Language Models in Detection of Thought Disorder |
Small neural language models outperform large language models at detecting thought disorder |
large language model |
|
|
| 5 |
Generative Linguistics, Large Language Models, and the Social Nature of Scientific Success |
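Explores how generative linguistics can develop under the impact of large language models: moving beyond formalization and embracing the social dimension |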
large language model |
|
|
| 6 |
Rosetta-PL: Propositional Logic as a Benchmark for Large Language Model Reasoning |
Proposes Rosetta-PL to evaluate the logical reasoning ability of large language models |
large language model |
|
|
| 7 |
Poor Alignment and Steerability of Large Language Models: Evidence from College Admission Essays |
Large language models show alignment and steerability problems when generating college admission essays |
large language model |
|
|
| 8 |
1.4 Million Open-Source Distilled Reasoning Dataset to Empower Large Language Model Training |
Proposes a 1.4-million-scale open-source distilled reasoning dataset to strengthen large language model training |
large language model |
✅ |
|
| 9 |
PHEONA: An Evaluation Framework for Large Language Model-based Approaches to Computational Phenotyping |
PHEONA: a framework for evaluating large language model-based computational phenotyping methods |
large language model |
|
|
| 10 |
Linguistic Blind Spots of Large Language Models |
Reveals blind spots of large language models in fine-grained linguistic annotation tasks |
large language model |
|
|
| 11 |
DomainCQA: Crafting Knowledge-Intensive QA from Domain-Specific Charts |
Proposes the DomainCQA framework for building domain-specific, knowledge-intensive chart question answering benchmarks |
large language model multimodal |
|
|
| 12 |
SCI-IDEA: Context-Aware Scientific Ideation Using Token and Sentence Embeddings |
SCI-IDEA: scientific ideation using context-aware token and sentence embeddings |
large language model chain-of-thought |
|
|
| 13 |
A Comparative Analysis of Word Segmentation, Part-of-Speech Tagging, and Named Entity Recognition for Historical Chinese Sources, 1900-1950 |
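Compares LLMs with traditional NLP tools on historical Chinese texts from 1900-1950, covering word segmentation, part-of-speech tagging, and named entity recognition. |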
large language model |
|
|
| 14 |
HausaNLP at SemEval-2025 Task 3: Towards a Fine-Grained Model-Aware Hallucination Detection |
HausaNLP proposes a ModernBERT-based fine-tuning approach for fine-grained, model-aware hallucination detection. |
large language model |
|
|
| 15 |
Scaling Laws of Synthetic Data for Language Models |
SynthLLM: automatically generates high-quality synthetic data using graph algorithms and explores scaling laws for language models. |
large language model |
|
|
| 16 |
CausalRAG: Integrating Causal Graphs into Retrieval-Augmented Generation |
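CausalRAG: proposes a retrieval-augmented generation framework that integrates causal graphs, improving accuracy and interpretability on knowledge-intensive tasks. |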
large language model |
|
|
| 17 |
Scaling Evaluation-time Compute with Reasoning Models as Process Evaluators |
Uses reasoning models as process evaluators to scale evaluation-time compute |
chain-of-thought |
|
|
| 18 |
SemEval-2025 Task 9: The Food Hazard Detection Challenge |
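SemEval-2025 Task 9 presents a food hazard detection challenge based on a long-tailed distribution and validates the effectiveness of synthetic data and several model architectures. |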
large language model |
|
|
| 19 |
AdaptiVocab: Enhancing LLM Efficiency in Focused Domains through Lightweight Vocabulary Adaptation |
AdaptiVocab: improves LLM efficiency in focused domains through lightweight vocabulary adaptation |
large language model |
|
|
| 20 |
Exploring Cultural Nuances in Emotion Perception Across 15 African Languages |
Proposes a cross-lingual analysis of emotion expression to address emotion detection in African languages |
multimodal |
|
|
| 21 |
Iterative Hypothesis Generation for Scientific Discovery with Monte Carlo Nash Equilibrium Self-Refining Trees |
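Proposes the MC-NEST framework, which iteratively refines scientific hypothesis generation using Monte Carlo tree search and Nash equilibrium. |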
large language model |
|
|
| 22 |
MARS: Memory-Enhanced Agents with Reflective Self-improvement |
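MARS: proposes a memory-enhanced agent framework that uses reflective self-improvement to address LLMs' long-term memory and decision-making problems in dynamic environments. |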
large language model |
|
|