| 1 |
From No to Know: Taxonomy, Challenges, and Opportunities for Negation Understanding in Multimodal Foundation Models |
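Proposes a taxonomy of multimodal negation understanding to address the challenges multimodal foundation models face in interpreting negation. |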
foundation model multimodal |
|
|
| 2 |
Large Language Models Meet Symbolic Provers for Logical Reasoning Evaluation |
Proposes the ProverGen framework, which combines LLMs with symbolic provers to generate ProverQA, a high-quality first-order logic reasoning dataset.
large language model chain-of-thought |
✅ |
|
| 3 |
Structural Reformation of Large Language Model Neuron Encapsulation for Divergent Information Aggregation |
Proposes structured neuron encapsulation to improve information aggregation and logical reasoning in large language models.
large language model |
|
|
| 4 |
Multi-turn Evaluation of Anthropomorphic Behaviours in Large Language Models |
Proposes a multi-turn evaluation method for measuring the degree of anthropomorphic behaviour in large language models.
large language model |
|
|
| 5 |
Specializing Large Language Models to Simulate Survey Response Distributions for Global Populations |
Proposes a method based on fine-tuned LLMs for simulating survey response distributions of global populations.
large language model |
|
|
| 6 |
Demystifying Singular Defects in Large Language Models |
Uncovers singular defects in large language models: analyzes the high-norm token phenomenon through singular vectors.
large language model |
✅ |
|
| 7 |
Boosting Self-Efficacy and Performance of Large Language Models via Verbal Efficacy Stimulations |
Boosts the self-efficacy and performance of large language models through verbal efficacy stimulations.
large language model |
|
|
| 8 |
Hephaestus: Improving Fundamental Agent Capabilities of Large Language Models through Continual Pre-Training |
Hephaestus: improves the fundamental agent capabilities of large language models through continual pre-training.
large language model |
|
|
| 9 |
A Survey of Theory of Mind in Large Language Models: Evaluations, Representations, and Safety Risks |
Surveys theory of mind in large language models: evaluations, representations, and safety risks.
large language model |
|
|
| 10 |
Systematic Outliers in Large Language Models |
Analyzes systematic outliers in LLMs in depth, revealing their causes, functions, and effects on the model.
large language model |
✅ |
|
| 11 |
Latent Convergence Modulation in Large Language Models: A Novel Approach to Iterative Contextual Realignment |
Proposes latent convergence modulation to improve contextual consistency in long-form text generation by large language models.
large language model |
|
|
| 12 |
DebateBench: A Challenging Long Context Reasoning Benchmark For Large Language Models |
Proposes DebateBench, a challenging benchmark for evaluating the long-context reasoning ability of large language models.
large language model |
|
|
| 13 |
GuideLLM: Exploring LLM-Guided Conversation with Applications in Autobiography Interviewing |
Proposes GuideLLM, exploring LLM-guided conversation with applications in autobiography interviewing.
large language model instruction following |
|
|
| 14 |
Non-literal Understanding of Number Words by Language Models |
Uses chain-of-thought prompting to improve large language models' non-literal understanding of number words.
large language model chain-of-thought |
|
|
| 15 |
ConMeC: A Dataset for Metonymy Resolution with Common Nouns |
ConMeC: a dataset for metonymy resolution with common nouns.
large language model chain-of-thought |
✅ |
|
| 16 |
Cardiverse: Harnessing LLMs for Novel Card Game Prototyping |
Cardiverse: uses large language models for novel card game prototyping.
large language model |
✅ |
|
| 17 |
Tokenization Standards for Linguistic Integrity: Turkish as a Benchmark |
Proposes a new framework for evaluating tokenization strategies for Turkish.
large language model |
|
|
| 18 |
AIMS.au: A Dataset for the Analysis of Modern Slavery Countermeasures in Corporate Statements |
Proposes the AIMS.au dataset for analyzing modern slavery countermeasures in corporate statements.
large language model |
|
|
| 19 |
Finding Words Associated with DIF: Predicting Differential Item Functioning using LLMs and Explainable AI |
Uses LLMs and explainable AI to predict DIF, identifying words associated with DIF to improve assessment fairness.
large language model |
|
|
| 20 |
Investigating the Zone of Proximal Development of Language Models for In-Context Learning |
Uses zone of proximal development theory to analyze the in-context learning ability of LLMs, improving inference and fine-tuning.
large language model |
|
|
| 21 |
Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling |
Proposes a compute-optimal test-time scaling strategy that lets small models surpass large models on complex tasks.
large language model |
|
|
| 22 |
In-Context Learning (and Unlearning) of Length Biases |
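Shows that large language models can learn length biases through in-context learning, and that this can be used to remove length biases encoded in the model itself.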
large language model |
|
|
| 23 |
Transparent NLP: Using RAG and LLM Alignment for Privacy Q&A |
Proposes a MultiRAIN-aligned RAG system to improve the transparency and compliance of LLMs in privacy Q&A.
large language model |
|
|
| 24 |
Do we really have to filter out random noise in pre-training data for language models? |
Shows that random noise in pre-training data has limited impact on language models, and proposes a local gradient matching loss.
multimodal |
|
|
| 25 |
LawGPT: Knowledge-Guided Data Generation and Its Application to Legal LLM |
Proposes KgDG, a knowledge-guided data generation framework, to improve the reasoning ability of open-source legal LLMs.
large language model |
✅ |
|
| 26 |
Adaptive Prompting: Ad-hoc Prompt Composition for Social Bias Detection |
Proposes an adaptive prompt composition method to improve performance on social bias detection.
large language model |
|
|
| 27 |
KARMA: Leveraging Multi-Agent LLMs for Automated Knowledge Graph Enrichment |
KARMA: uses multi-agent LLMs to automatically enrich knowledge graphs.
large language model |
|
|
| 28 |
Can AI Examine Novelty of Patents?: Novelty Evaluation Based on the Correspondence between Patent Claim and Prior Art |
Proposes an LLM-based method for evaluating patent novelty and builds the first related dataset.
large language model |
|
|
| 29 |
SeaExam and SeaBench: Benchmarking LLMs with Local Multilingual Questions in Southeast Asia |
Proposes SeaExam and SeaBench for evaluating LLMs in local multilingual Southeast Asian settings.
large language model |
|
|
| 30 |
Krutrim LLM: Multilingual Foundational Model for over a Billion People |
Krutrim LLM: a multilingual Indic-language foundation model designed for a population of a billion.
foundation model |
|
|
| 31 |
Jakiro: Boosting Speculative Decoding with Decoupled Multi-Head via MoE |
Jakiro: uses MoE to decouple the multi-head attention mechanism, accelerating speculative decoding and improving accuracy.
large language model |
✅ |
|
| 32 |
Emergent Response Planning in LLMs |
Reveals emergent response planning in LLMs: hidden layers encode attributes of future outputs.
large language model |
|
|
| 33 |
Is LLM an Overconfident Judge? Unveiling the Capabilities of LLMs in Detecting Offensive Language with Annotation Disagreement |
Reveals the capabilities and overconfidence of LLMs in offensive language detection under annotation disagreement.
large language model |
|
|
| 34 |
Scaling Public Health Text Annotation: Zero-Shot Learning vs. Crowdsourcing for Improved Efficiency and Labeling Accuracy |
Explores LLM zero-shot learning for public health text annotation, improving efficiency and evaluating labeling accuracy.
large language model |
|
|
| 35 |
LegalViz: Legal Text Visualization by Text To Diagram Generation |
Proposes the LegalViz dataset for generating easy-to-understand diagrams from legal text, improving the accessibility of legal knowledge.
large language model |
|
|
| 36 |
LCIRC: A Recurrent Compression Approach for Efficient Long-form Context and Query Dependent Modeling in LLMs |
Proposes LCIRC, which efficiently handles long-form context in LLMs through recurrent compression and query-dependent modeling.
large language model |
|
|