| 1 |
Internal Chain-of-Thought: Empirical Evidence for Layer-wise Subtask Scheduling in LLMs |
Proposes internal chain-of-thought to improve the transparency of task execution in large language models |
large language model chain-of-thought |
|
|
| 2 |
ABBA-Adapters: Efficient and Expressive Fine-Tuning of Foundation Models |
Proposes ABBA to address efficient adaptation of large language models to new domains |
large language model foundation model |
✅ |
|
| 3 |
EfficientLLM: Efficiency in Large Language Models |
Proposes EfficientLLM to address efficiency problems in large language models |
large language model foundation model |
|
|
| 4 |
ModRWKV: Transformer Multimodality in Linear Time |
Proposes ModRWKV to address computational complexity in multimodal learning |
large language model multimodal |
|
|
| 5 |
Enhanced Multimodal Aspect-Based Sentiment Analysis by LLM-Generated Rationales |
Proposes the LRSA framework to address information integration in multimodal sentiment analysis |
large language model multimodal |
|
|
| 6 |
CAFES: A Collaborative Multi-Agent Framework for Multi-Granular Multimodal Essay Scoring |
Proposes the CAFES framework to address multimodal automated essay scoring |
large language model multimodal |
|
|
| 7 |
DecIF: Improving Instruction-Following through Meta-Decomposition |
Proposes the DecIF framework to address the lack of flexibility in generating instruction-following data |
large language model instruction following |
|
|
| 8 |
Large Language Models Implicitly Learn to See and Hear Just By Reading |
Proposes large language models that achieve visual and auditory understanding through training on text |
large language model |
|
|
| 9 |
Saten: Sparse Augmented Tensor Networks for Post-Training Compression of Large Language Models |
Proposes Saten to address compression of large language models |
large language model |
|
|
| 10 |
Cheaper, Better, Faster, Stronger: Robust Text-to-SQL without Chain-of-Thought or Fine-Tuning |
Proposes N-rep consistency to reduce the cost of text-to-SQL conversion |
chain-of-thought |
|
|
| 11 |
Scaling Laws for State Dynamics in Large Language Models |
Explores scaling laws for state dynamics in large language models |
large language model |
|
|
| 12 |
Toward Reliable Scientific Hypothesis Generation: Evaluating Truthfulness and Hallucination in Large Language Models |
Proposes TruthHypo and KnowHD to address truthfulness in scientific hypothesis generation |
large language model |
✅ |
|
| 13 |
Attributional Safety Failures in Large Language Models under Code-Mixed Perturbations |
Proposes the SDA framework to address LLM safety under code-mixing |
large language model |
|
|
| 14 |
Neural Incompatibility: The Unbridgeable Gap of Cross-Scale Parametric Knowledge Transfer in Large Language Models |
Proposes LaTen to address knowledge transfer between large language models |
large language model |
✅ |
|
| 15 |
DiagnosisArena: Benchmarking Diagnostic Reasoning for Large Language Models |
Proposes DiagnosisArena to evaluate the diagnostic reasoning ability of large language models |
large language model |
✅ |
|
| 16 |
Development and Validation of Engagement and Rapport Scales for Evaluating User Experience in Multimodal Dialogue Systems |
Proposes user-experience evaluation scales to improve multimodal dialogue systems |
multimodal |
|
|
| 17 |
Multimodal Cultural Safety: Evaluation Framework and Alignment Strategies |
Proposes the CROSS benchmark to evaluate the cultural safety of large vision-language models |
multimodal |
|
|
| 18 |
DECASTE: Unveiling Caste Stereotypes in Large Language Models through Multi-Dimensional Bias Analysis |
Proposes the DECASTE framework to reveal caste bias in large language models |
large language model |
|
|
| 19 |
Teaching Audio-Aware Large Language Models What Does Not Hear: Mitigating Hallucinations through Synthesized Negative Samples |
Proposes LISTEN to address hallucination in audio-aware large language models |
large language model |
|
|
| 20 |
S2SBench: A Benchmark for Quantifying Intelligence Degradation in Speech-to-Speech Large Language Models |
Proposes S2SBench to quantify intelligence degradation in speech-to-speech large language models |
large language model |
✅ |
|
| 21 |
OmniGenBench: A Modular Platform for Reproducible Genomic Foundation Models Benchmarking |
Proposes OmniGenBench to address reproducibility in evaluating genomic foundation models |
foundation model |
|
|
| 22 |
QA-prompting: Improving Summarization with Large Language Models using Question-Answering |
Proposes QA-prompting to address positional bias in long-text summarization |
large language model |
|
|
| 23 |
Cross-Lingual Optimization for Language Transfer in Large Language Models |
Proposes a cross-lingual optimization method to address language transfer in large language models |
large language model |
|
|
| 24 |
Unraveling Interwoven Roles of Large Language Models in Authorship Privacy: Obfuscation, Mimicking, and Verification |
Proposes a unified framework to analyze the roles of large language models in authorship privacy |
large language model |
|
|
| 25 |
Beyond Chains: Bridging Large Language Models and Knowledge Bases in Complex Question Answering |
Proposes the PDRR framework to address knowledge-base integration in complex question answering |
large language model |
|
|
| 26 |
ShieldVLM: Safeguarding the Multimodal Implicit Toxicity via Deliberative Reasoning with LVLMs |
Proposes ShieldVLM to address detection of implicit multimodal toxicity |
multimodal |
|
|
| 27 |
AUTOLAW: Enhancing Legal Compliance in Large Language Models via Case Law Generation and Jury-Inspired Deliberation |
Proposes AutoLaw to address legal compliance |
large language model |
|
|
| 28 |
Activation-Guided Consensus Merging for Large Language Models |
Proposes activation-guided consensus merging to improve the efficiency and stability of large language models |
large language model |
|
|
| 29 |
Mixed Signals: Understanding Model Disagreement in Multimodal Empathy Detection |
Proposes a multimodal model to address conflicts among signals of the same kind |
multimodal |
|
|
| 30 |
Informatics for Food Processing |
Proposes the FoodProX model to address subjectivity in food-processing classification |
large language model multimodal |
|
|
| 31 |
Amadeus-Verbo Technical Report: The powerful Qwen2.5 family models trained in Portuguese |
Proposes the Amadeus Verbo models to advance open-source development for Brazilian Portuguese |
large language model foundation model |
✅ |
|
| 32 |
PersonaTAB: Predicting Personality Traits using Textual, Acoustic, and Behavioral Cues in Fully-Duplex Speech Dialogs |
Proposes PersonaTAB to address the lack of personality annotations in personalized dialogue systems |
large language model TAMP |
|
|
| 33 |
Self-Reasoning Language Models: Unfold Hidden Reasoning Chains with Few Reasoning Catalyst |
Proposes self-reasoning language models to improve performance on complex reasoning tasks |
large language model chain-of-thought |
|
|
| 34 |
Mapping the Minds of LLMs: A Graph-Based Analysis of Reasoning LLM |
Proposes a graph-based analysis framework to improve the reasoning ability of large language models |
large language model chain-of-thought |
|
|
| 35 |
Too Long, Didn't Model: Decomposing LLM Long-Context Understanding With Novels |
Proposes the TLDM benchmark to evaluate LLM performance on long-context understanding |
large language model |
|
|
| 36 |
EasyMath: A 0-shot Math Benchmark for SLMs |
Proposes the EasyMath benchmark to evaluate the mathematical reasoning ability of small language models |
chain-of-thought |
|
|
| 37 |
Automated Journalistic Questions: A New Method for Extracting 5W1H in French |
Proposes an automated extraction method to address 5W1H information extraction from French news |
large language model |
|
|
| 38 |
UltraEdit: Training-, Subject-, and Memory-Free Lifelong Editing in Language Models |
Proposes UltraEdit to address lifelong editing of large language models |
large language model |
✅ |
|
| 39 |
WirelessMathBench: A Mathematical Modeling Benchmark for LLMs in Wireless Communications |
Proposes WirelessMathBench to evaluate the mathematical modeling ability of LLMs in wireless communications |
large language model |
|
|
| 40 |
Temporal Alignment of Time Sensitive Facts with Activation Engineering |
Proposes activation engineering to address time sensitivity in large language models |
large language model |
|
|
| 41 |
Through a Compressed Lens: Investigating the Impact of Quantization on LLM Explainability and Interpretability |
Studies the impact of quantization on the explainability and interpretability of large language models |
large language model |
|
|
| 42 |
Mechanistic Interpretability of GPT-like Models on Summarization Tasks |
Proposes a mechanistic interpretability framework to analyze how GPT models perform on summarization tasks |
large language model |
|
|
| 43 |
WebNovelBench: Placing LLM Novelists on the Web Novel Distribution |
Proposes WebNovelBench to address evaluation of long-form novel generation |
large language model |
|
|
| 44 |
Creative Preference Optimization |
Proposes a creative preference optimization method to improve LLM creativity |
large language model |
|
|
| 45 |
MUG-Eval: A Proxy Evaluation Framework for Multilingual Generation Capabilities in Any Language |
Proposes the MUG-Eval framework to evaluate multilingual generation capabilities |
large language model |
|
|
| 46 |
GemMaroc: Unlocking Darija Proficiency in LLMs with Minimal Data |
Proposes GemMaroc to address Moroccan Arabic processing |
large language model |
|
|
| 47 |
Tokenization Constraints in LLMs: A Study of Symbolic and Arithmetic Reasoning Limits |
Proposes Token Awareness to address the limits of symbolic reasoning in LLMs |
chain-of-thought |
|
|
| 48 |
A Personalized Conversational Benchmark: Towards Simulating Personalized Conversations |
Proposes PersonaConvBench to evaluate personalized conversation generation |
large language model |
|
|
| 49 |
GloSS over Toxicity: Understanding and Mitigating Toxicity in LLMs via Global Toxic Subspace |
Proposes GloSS to address toxicity in large language models |
large language model |
|
|
| 50 |
From Unaligned to Aligned: Scaling Multilingual LLMs with Multi-Way Parallel Corpora |
Proposes multi-way parallel corpora to improve the performance of multilingual large models |
large language model |
|
|
| 51 |
FlashThink: An Early Exit Method For Efficient Reasoning |
Proposes FlashThink to address reasoning efficiency in large language models |
large language model |
|
|
| 52 |
EEG-to-Text Translation: A Model for Deciphering Human Brain Activity |
Proposes R1 Translator to improve EEG-to-text translation performance |
large language model |
✅ |
|
| 53 |
ConspEmoLLM-v2: A robust and stable model to detect sentiment-transformed conspiracy theories |
Proposes ConspEmoLLM-v2 to address detection of sentiment-transformed conspiracy theories |
large language model |
✅ |
|
| 54 |
Concept Incongruence: An Exploration of Time and Death in Role Playing |
Proposes concept incongruence to analyze issues of time and death in role playing |
large language model |
|
|
| 55 |
Incorporating Token Usage into Prompting Strategy Evaluation |
Proposes the Big-$O_{tok}$ framework to improve efficiency evaluation of prompting strategies |
large language model |
|
|
| 56 |
SEPS: A Separability Measure for Robust Unlearning in LLMs |
Proposes the SEPS framework to address unlearning under mixed queries in large language models |
large language model |
|
|
| 57 |
Tracing Multilingual Factual Knowledge Acquisition in Pretraining |
Traces multilingual factual knowledge acquisition to improve the cross-lingual consistency of language models |
large language model |
✅ |
|
| 58 |
Language Mixing in Reasoning Language Models: Patterns, Impact, and Internal Causes |
Systematically studies the impact of language mixing on reasoning language models, along with optimization strategies |
chain-of-thought |
|
|
| 59 |
sudoLLM: On Multi-role Alignment of Language Models |
Proposes sudoLLM to address multi-role alignment of language models |
large language model |
|
|
| 60 |
TRATES: Trait-Specific Rubric-Assisted Cross-Prompt Essay Scoring |
Proposes TRATES to address insufficient assessment of individual traits |
large language model |
|
|
| 61 |
Breaking Bad Tokens: Detoxification of LLMs Using Sparse Autoencoders |
Uses sparse autoencoders to detoxify large language models |
large language model |
|
|
| 62 |
MoMoE: Mixture of Moderation Experts Framework for AI-Assisted Online Governance |
Proposes the MoMoE framework to address transparency in content moderation for online communities |
large language model |
|
|
| 63 |
Rank-K: Test-Time Reasoning for Listwise Reranking |
Proposes Rank-K to address efficient reranking for multilingual queries |
large language model |
|
|
| 64 |
From Templates to Natural Language: Generalization Challenges in Instruction-Tuned LLMs for Spatial Reasoning |
Studies instruction generalization challenges in spatial reasoning |
large language model |
|
|
| 65 |
Pierce the Mists, Greet the Sky: Decipher Knowledge Overshadowing via Knowledge Circuit Analysis |
Proposes PhantomCircuit to address knowledge overshadowing |
large language model |
|
|
| 66 |
Is Your Prompt Safe? Investigating Prompt Injection Attacks Against Open-Source LLMs |
Proposes effective prompt injection attacks to evaluate the security of open-source LLMs |
large language model |
|
|
| 67 |
Dual Decomposition of Weights and Singular Value Low Rank Adaptation |
Proposes DuDe to address the training instability and inefficient knowledge transfer of LoRA methods |
large language model |
|
|
| 68 |
OSoRA: Output-Dimension and Singular-Value Initialized Low-Rank Adaptation |
Proposes OSoRA to address the computational resource demands of fine-tuning large language models |
large language model |
|
|
| 69 |
Teaching Small Language Models to Learn Logic through Meta-Learning |
Uses meta-learning to improve the logical reasoning ability of small language models |
large language model |
|
|
| 70 |
JOLT-SQL: Joint Loss Tuning of Text-to-SQL with Confusion-aware Noisy Schema Sampling |
Proposes JOLT-SQL to address noisy schemas in text-to-SQL mapping |
large language model |
✅ |
|
| 71 |
Universal Acoustic Adversarial Attacks for Flexible Control of Speech-LLMs |
Proposes universal acoustic adversarial attacks for flexible control of speech large language models |
large language model |
|
|
| 72 |
ThinkSwitcher: When to Think Hard, When to Think Fast |
Proposes ThinkSwitcher to address the computational efficiency of large reasoning models |
chain-of-thought |
|
|
| 73 |
SlangDIT: Benchmarking LLMs in Interpretative Slang Translation |
Proposes SlangDIT to address context dependence in slang translation |
large language model |
|
|
| 74 |
The Strawberry Problem: Emergence of Character-level Understanding in Tokenized Language Models |
Proposes lightweight architectural modifications to address character-level understanding |
large language model |
|
|
| 75 |
Legal Rule Induction: Towards Generalizable Principle Discovery from Analogous Judicial Precedents |
Proposes a legal rule induction method to address extraction of implicit principles from judicial precedents |
large language model |
|
|
| 76 |
MultiHal: Multilingual Dataset for Knowledge-Graph Grounded Evaluation of LLM Hallucinations |
Proposes MultiHal to address multilingual, knowledge-graph-grounded evaluation of LLM hallucinations |
large language model |
|
|
| 77 |
BAR: A Backward Reasoning based Agent for Complex Minecraft Tasks |
Proposes the BAR agent to address reasoning in complex Minecraft tasks |
large language model |
|
|
| 78 |
Enhancing LLMs via High-Knowledge Data Selection |
Proposes a high-knowledge scorer to address knowledge scarcity in LLMs |
large language model |
|
|
| 79 |
Beyond Text: Unveiling Privacy Vulnerabilities in Multi-modal Retrieval-Augmented Generation |
Proposes a method for analyzing privacy vulnerabilities in multimodal retrieval-augmented generation |
multimodal |
|
|
| 80 |
Cross-Linguistic Transfer in Multilingual NLP: The Role of Language Families and Morphology |
Explores the role of language families and morphology in cross-lingual transfer for multilingual NLP |
zero-shot transfer |
|
|
| 81 |
Let's Verify Math Questions Step by Step |
Proposes MathQ-Verify to address the challenges of verifying math questions |
large language model |
✅ |
|
| 82 |
PandaGuard: Systematic Evaluation of LLM Safety against Jailbreaking Attacks |
Proposes PandaGuard to systematically evaluate LLM safety against jailbreaking attacks |
large language model |
|
|
| 83 |
Improve Language Model and Brain Alignment via Associative Memory |
Improves alignment between language models and the brain through associative memory |
large language model |
|
|