| 1 | S3-CoT: Self-Sampled Succinct Reasoning Enables Efficient Chain-of-Thought LLMs | S3-CoT: self-sampled succinct reasoning for efficient chain-of-thought LLMs | large language model chain-of-thought | ✅ | |
| 2 | Abstract Activation Spaces for Content-Invariant Reasoning in Large Language Models | Proposes an abstract activation space framework that improves LLM robustness in content-invariant reasoning | large language model | | |
| 3 | Game of Thought: Robust Information Seeking with Large Language Models Using Game Theory | Proposes the Game of Thought framework, which uses game theory to make LLMs more robust on information-seeking tasks | large language model | | |
| 4 | Large Language Models for Mental Health: A Multilingual Evaluation | Multilingual mental health: evaluates LLM performance and the effect of translation quality | large language model | | |
| 5 | Evaluating Metalinguistic Knowledge in Large Language Models across the World's Languages | Builds a multilingual benchmark of metalinguistic knowledge, revealing differences in LLMs' structural understanding across languages | large language model | | |
| 6 | AR-MAP: Are Autoregressive Large Language Models Implicit Teachers for Diffusion Large Language Models? | AR-MAP: uses autoregressive LLMs as implicit teachers for diffusion LLMs to achieve efficient preference alignment | large language model | ✅ | |
| 7 | There Is More to Refusal in Large Language Models than a Single Direction | Reveals the complexity of LLM refusal behavior: it is not controlled by a single activation direction | large language model | | |
| 8 | Orthogonal Hierarchical Decomposition for Structure-Aware Table Understanding with Large Language Models | Proposes an orthogonal hierarchical decomposition framework that improves LLM understanding of and reasoning over complex tables | large language model | | |
| 9 | Data Distribution Matters: A Data-Centric Perspective on Context Compression for Large Language Model | Studies, from a data-centric perspective, how data distribution affects the quality of LLM context compression | large language model | | |
| 10 | Mechanistic Indicators of Steering Effectiveness in Large Language Models | Uses internal model signals to diagnose the effectiveness of LLM steering, improving the reliability of behavior control | large language model | | |
| 11 | Steering Vector Fields for Context-Aware Inference-Time Control in Large Language Models | Proposes Steering Vector Fields to address the unreliability of inference-time steering vectors in LLMs | large language model | | |
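| 12 | LEC-KG: An LLM-Embedding Collaborative Framework for Domain-Specific Knowledge Graph Construction -- A Case Study on SDGs | Proposes the LEC-KG framework, in which LLMs and knowledge graph embeddings collaborate to build domain-specific knowledge graphs, addressing the difficulty of constructing a knowledge graph for the Sustainable Development Goals | large language model chain-of-thought | | |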
| 13 | ROG: Retrieval-Augmented LLM Reasoning for Complex First-Order Queries over Knowledge Graphs | Proposes ROG, a retrieval-augmented LLM reasoning framework for complex first-order logic queries over knowledge graphs | large language model chain-of-thought | | |
| 14 | Cross-Lingual Stability of LLM Judges Under Controlled Generation: Evidence from Finno-Ugric Languages | Shows that LLM judges lack cross-lingual stability in Finno-Ugric languages, especially on pragmatic judgments | large language model instruction following | ✅ | |
| 15 | CodeOCR: On the Effectiveness of Vision Language Models in Code Understanding | CodeOCR: explores the effectiveness of vision-language models for code understanding, enabling efficient code representation | large language model multimodal | | |
| 16 | Why Steering Works: Toward a Unified View of Language Model Parameter Dynamics | Proposes a unified view for improving language model control methods | large language model | ✅ | |
| 17 | Reward-free Alignment for Conflicting Objectives | Proposes the RACO framework, which addresses multi-objective LLM alignment through conflict-averse gradient descent | large language model | | |
| 18 | MemSkill: Learning and Evolving Memory Skills for Self-Evolving Agents | MemSkill: learning and evolving memory skills to power self-evolving agents | large language model | | |
| 19 | OpenSeal: Good, Fast, and Cheap Construction of an Open-Source Southeast Asian LLM via Parallel Data | OpenSeal: efficiently builds an open-source Southeast Asian LLM from parallel data | large language model | | |
| 20 | Towards AI Evaluation in Domain-Specific RAG Systems: The AgriHubi Case Study | AgriHubi: a domain-specific RAG system for Finnish-language agricultural decision support, with an evaluation methodology | large language model | | |
| 21 | Automated Multiple Mini Interview (MMI) Scoring | Proposes a multi-agent prompting framework that improves LLM performance on automated Multiple Mini Interview scoring | large language model | | |
| 22 | Language Steering for Multilingual In-Context Learning | Proposes a language-vector steering method that improves non-English performance in multilingual in-context learning | large language model | | |
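| 23 | The Shape of Beliefs: Geometry, Dynamics, and Interventions along Representation Manifolds of Language Models' Posteriors | Studies the geometric structure of LLM beliefs and proposes a field-aware linear steering method to improve intervention effectiveness | large language model | | |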
| 24 | Am I More Pointwise or Pairwise? Revealing Position Bias in Rubric-Based LLM-as-a-Judge | Reveals position bias in rubric-based LLM-as-a-judge evaluation and proposes a balanced permutation strategy to mitigate it | large language model | | |
| 25 | Focus-dLLM: Accelerating Long-Context Diffusion LLM Inference via Confidence-Guided Context Focusing | Proposes Focus-dLLM, which accelerates long-context diffusion LLM inference through confidence-guided context focusing | large language model | ✅ | |
| 26 | Out of the Memory Barrier: A Highly Memory Efficient Training System for LLMs with Million-Token Contexts | OOMB: a highly memory-efficient LLM training system that supports million-token contexts | large language model | ✅ | |
| 27 | Dicta-LM 3.0: Advancing The Frontier of Hebrew Sovereign LLMs | Releases Dicta-LM 3.0, advancing the frontier of sovereign Hebrew LLMs with multiple model sizes and tool-calling support | large language model | | |
| 28 | Beyond Local Edits: Embedding-Virtualized Knowledge for Broader Evaluation and Preservation of Model Editing | Proposes embedding-virtualized knowledge to address the inadequate evaluation of model editing | large language model | | |
| 29 | From Code-Centric to Concept-Centric: Teaching NLP with LLM-Assisted "Vibe Coding" | Proposes an LLM-assisted "Vibe Coding" teaching approach that improves understanding of NLP concepts | large language model | | |
| 30 | AXE: Low-Cost Cross-Domain Web Structured Information Extraction | AXE: a low-cost method for cross-domain structured information extraction from web pages, using small LLMs for efficient extraction | large language model | | |
| 31 | WorldCup Sampling for Multi-bit LLM Watermarking | Proposes WorldCup, which embeds multi-bit LLM watermarks through hierarchical competition sampling, improving capacity and robustness | large language model | | |
| 32 | ARTIS: Agentic Risk-Aware Test-Time Scaling via Iterative Simulation | Proposes ARTIS, which uses iterative simulation to allocate an agent's test-time compute in a risk-aware way, improving agent reliability | large language model | | |
| 33 | The Art of Socratic Inquiry: A Framework for Proactive Template-Guided Therapeutic Conversation Generation | Proposes a Socratic inquiry framework that improves LLMs' ability to guide cognitive behavioral therapy conversations proactively | large language model | | |
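| 34 | Provable Defense Framework for LLM Jailbreaks via Noise-Augumented Alignment | Proposes a certified defense framework based on noise-augmented alignment, improving LLM robustness against malicious jailbreak attacks | large language model | | |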
| 35 | LLM-based Embeddings: Attention Values Encode Sentence Semantics Better Than Hidden States | Proposes a sentence embedding method based on LLM attention values that outperforms hidden-state-based methods | large language model | | |
| 36 | FS-Researcher: Test-Time Scaling for Long-Horizon Research Tasks with File-System-Based Agents | FS-Researcher: a test-time scaling framework for file-system-based LLM agents on long-horizon research tasks | large language model | ✅ | |
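| 37 | Argument Rarity-based Originality Assessment for AI-Assisted Writing | Proposes AROA, an argument-rarity-based originality assessment framework for evaluating the originality of student essays in AI-assisted writing | large language model | | |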