| 1 |
CRANE: Causal Relevance Analysis of Language-Specific Neurons in Multilingual Large Language Models |
CRANE: analyzing language-specific neurons in multilingual large language models through causal relevance |
large language model language conditioned |
|
|
| 2 |
V-FAT: Benchmarking Visual Fidelity Against Text-bias |
The V-FAT benchmark exposes visual fidelity problems under text bias in multimodal large language models. |
large language model multimodal visual grounding |
|
|
| 3 |
See, Explain, and Intervene: A Few-Shot Multimodal Agent Framework for Hateful Meme Moderation |
Proposes a multimodal framework based on generative AI and few-shot learning to detect, explain, and intervene on hateful memes. |
multimodal |
|
|
| 4 |
BanglaLorica: Design and Evaluation of a Robust Watermarking Algorithm for Large Language Models in Bangla Text Generation |
BanglaLorica: proposes and evaluates a robust watermarking algorithm for LLM text generation in Bangla |
large language model |
|
|
| 5 |
Can Large Language Models Resolve Semantic Discrepancy in Self-Destructive Subcultures? Evidence from Jirai Kei |
Proposes a subculture alignment solver to address the detection of self-destructive subculture behavior |
large language model |
|
|
| 6 |
Learning from Mistakes: Negative Reasoning Samples Enhance Out-of-Domain Generalization |
Uses negative reasoning samples to improve the out-of-domain generalization of large language models |
large language model chain-of-thought |
|
|
| 7 |
THaLLE-ThaiLLM: Domain-Specialized Small LLMs for Finance and Thai -- Technical Report |
THaLLE-ThaiLLM: domain-specialized small LLMs for finance and Thai that achieve versatility through model merging. |
large language model instruction following |
|
|
| 8 |
Measuring and Fostering Peace through Machine Learning and Artificial Intelligence |
Uses machine learning and artificial intelligence to measure and foster peace |
large language model |
|
|
| 9 |
RelayLLM: Efficient Reasoning via Collaborative Decoding |
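RelayLLM: proposes an efficient reasoning framework based on collaborative decoding that significantly reduces the computational cost of large language models. |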
large language model |
|
|
| 10 |
CuMA: Aligning LLMs with Sparse Cultural Values via Demographic-Aware Mixture of Adapters |
Proposes CuMA, a demographic-aware mixture of adapters that aligns LLMs with sparse cultural values |
large language model |
✅ |
|
| 11 |
Differential syntactic and semantic encoding in LLMs |
Analyzes internal LLM representations to reveal that syntactic and semantic information are encoded differently |
large language model |
|
|
| 12 |
Prior-Informed Zeroth-Order Optimization with Adaptive Direction Alignment for Memory-Efficient LLM Fine-Tuning |
Proposes a prior-informed zeroth-order optimization method for efficient fine-tuning of large language models |
large language model |
|
|
| 13 |
SampoNLP: A Self-Referential Toolkit for Morphological Analysis of Subword Tokenizers |
SampoNLP: a self-referential toolkit for morphological analysis of subword tokenizers |
large language model |
✅ |
|
| 14 |
Agent-as-a-Judge |
Proposes the Agent-as-a-Judge framework to improve the reliability and verifiability of complex AI evaluation |
large language model |
|
|
| 15 |
Belief in Authority: Impact of Authority in Multi-Agent Evaluation Framework |
First study of a multi-agent evaluation framework: analyzes how role-based authority bias affects agent interactions |
large language model |
|
|
| 16 |
NC2C: Automated Convexification of Generic Non-Convex Optimization Problems |
NC2C: uses LLMs to automatically convexify generic non-convex optimization problems, improving solving efficiency. |
large language model |
|
|
| 17 |
PILOT-Bench: A Benchmark for Legal Reasoning in the Patent Domain with IRAC-Aligned Classification Tasks |
Proposes PILOT-Bench, an IRAC-aligned classification benchmark for legal reasoning in the patent domain |
large language model |
✅ |
|
| 18 |
RiskAtlas: Exposing Domain-Specific Risks in LLMs through Knowledge-Graph-Guided Harmful Prompt Generation |
RiskAtlas: exposes domain-specific risks in LLMs through knowledge-graph-guided harmful prompt generation |
large language model |
|
|
| 19 |
DSC2025 -- ViHallu Challenge: Detecting Hallucination in Vietnamese LLMs |
DSC2025 ViHallu Challenge: the first large-scale shared task on hallucination detection in Vietnamese LLMs. |
large language model |
|
|
| 20 |
ToolGate: Contract-Grounded and Verified Tool Execution for LLMs |
ToolGate: a safety framework for LLM tool execution based on contract verification |
large language model |
|
|
| 21 |
Semantically Orthogonal Framework for Citation Classification: Disentangling Intent and Content |
Proposes the SOFT framework, which disentangles citation intent from content type to improve citation classification |
large language model |
✅ |
|
| 22 |
Multi-Disciplinary Dataset Discovery from Citation-Verified Literature Contexts |
Proposes a multi-disciplinary dataset discovery framework based on citation contexts, improving dataset retrieval recall. |
large language model |
✅ |
|
| 23 |
GenProve: Learning to Generate Text with Fine-Grained Provenance |
GenProve: proposes a framework that generates text with fine-grained provenance information to address LLM hallucination. |
large language model |
|
|
| 24 |
Faithful Summarisation under Disagreement via Belief-Level Aggregation |
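Proposes a framework based on belief-level aggregation to address the neglect of conflicting viewpoints by existing opinion summarization methods. |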
large language model |
|
|
| 25 |
Mind2Report: A Cognitive Deep Research Agent for Expert-Level Commercial Report Synthesis |
Proposes Mind2Report, which emulates a commercial analyst to synthesize expert-level commercial reports |
large language model |
✅ |
|
| 26 |
Fame Fades, Nature Remains: Disentangling the Character Identity of Role-Playing Agents |
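Proposes a character identity disentanglement framework that separates parametric identity from attributive identity, improving the authenticity of role-playing agents. |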
large language model |
|
|
| 27 |
PRISM: A Unified Framework for Post-Training LLMs Without Verifiable Rewards |
PRISM: a unified framework for post-training LLMs without verifiable rewards |
large language model |
|
|
| 28 |
Thunder-KoNUBench: A Corpus-Aligned Benchmark for Korean Negation Understanding |
Proposes Thunder-KoNUBench to address Korean negation understanding |
large language model |
|
|
| 29 |
LinguaGame: A Linguistically Grounded Game-Theoretic Paradigm for Multi-Agent Dialogue Generation |
LinguaGame: a linguistically grounded, game-theoretic paradigm for multi-agent dialogue generation |
large language model |
|
|