cs.CL(2025-09-25)

📊 35 papers in total | 🔗 6 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (25 🔗4) · Pillar 2: RL & Architecture (9 🔗2) · Pillar 3: Perception & Semantics (1)

🔬 Pillar 9: Embodied Foundation Models (25 papers)

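| # | Title | One-line summary | Tags |
|---|---|---|---|
| 1 | ReviewScore: Misinformed Peer Review Detection with Large Language Models | Proposes ReviewScore to detect misinformed points in peer reviews and improve review quality. | large language model |
| 2 | Bounds of Chain-of-Thought Robustness: Reasoning Steps, Embed Norms, and Beyond | Theoretically analyzes the robustness bounds of CoT reasoning, revealing the effects of the number of reasoning steps and embedding norms. | chain-of-thought |
| 3 | CLaw: Benchmarking Chinese Legal Knowledge in Large Language Models - A Fine-grained Corpus and Reasoning Analysis | CLaw: builds a Chinese legal knowledge benchmark to evaluate the legal reasoning ability of large language models. | large language model |
| 4 | BESPOKE: Benchmark for Search-Augmented Large Language Model Personalization via Diagnostic Feedback | Proposes the BESPOKE benchmark for search-augmented LLM personalization driven by diagnostic feedback. | large language model |
| 5 | PerHalluEval: Persian Hallucination Evaluation Benchmark for Large Language Models | Proposes PerHalluEval, the first benchmark for evaluating LLM hallucination in Persian. | large language model |
| 6 | SoM-1K: A Thousand-Problem Benchmark Dataset for Strength of Materials | Proposes SoM-1K, a strength-of-materials benchmark dataset, to evaluate and improve large-model performance on multimodal engineering problems. | large language model, foundation model, multimodal |
| 7 | LLM-Based Support for Diabetes Diagnosis: Opportunities, Scenarios, and Challenges with GPT-5 | Uses GPT-5 to support diabetes diagnosis, improving clinical decision support and patient understanding. | large language model, multimodal |
| 8 | When Instructions Multiply: Measuring and Estimating LLM Capabilities of Multiple Instructions Following | Proposes the ManyIFEval and StyleMBPP benchmarks to evaluate and predict how well LLMs follow multiple instructions. | large language model, instruction following |
| 9 | OjaKV: Context-Aware Online Low-Rank KV Cache Compression with Oja's Rule | OjaKV: context-aware online low-rank KV cache compression using Oja's rule, improving long-context processing efficiency. | large language model |
| 10 | Hallucination-Resistant, Domain-Specific Research Assistant with Self-Evaluation and Vector-Grounded Retrieval | Proposes RA-FSM, a hallucination-resistant, domain-specific research assistant that improves expert workflow efficiency through self-evaluation and vector retrieval. | large language model |
| 11 | Generation-Time vs. Post-hoc Citation: A Holistic Evaluation of LLM Attribution | Compares generation-time and post-hoc citation in a holistic evaluation of LLM attribution, informing the choice between them in high-stakes settings. | large language model |
| 12 | On Code-Induced Reasoning in LLMs | Systematically studies how properties of code affect LLM reasoning, revealing the key role of structural and semantic perturbations. | large language model |
| 13 | One Model, Many Morals: Uncovering Cross-Linguistic Misalignments in Computational Moral Reasoning | Uncovers cross-linguistic misalignments in the moral reasoning of large language models in multilingual settings. | large language model |
| 14 | Sycophancy Is Not One Thing: Causal Separation of Sycophantic Behaviors in LLMs | Shows that LLM sycophancy is not a single mechanism and proposes a causal separation method to control different sycophantic behaviors independently. | large language model |
| 15 | The role of synthetic data in Multilingual, Multi-cultural AI systems: Lessons from Indic Languages | Proposes the Updesh dataset, using synthetic data to improve multilingual, multicultural AI systems on Indic languages. | instruction following |
| 16 | DisCoCLIP: A Distributional Compositional Tensor Network Encoder for Vision-Language Understanding | DisCoCLIP: a distributional compositional tensor network encoder for vision-language understanding. | multimodal |
| 17 | LLMTrace: A Corpus for Classification and Fine-Grained Localization of AI-Written Text | LLMTrace: a bilingual dataset for classification and fine-grained localization of AI-generated text. | large language model |
| 18 | LLM Output Homogenization is Task Dependent | Proposes task-dependent evaluation and mitigation of LLM output homogenization, improving functional diversity. | large language model |
| 19 | Eigen-1: Adaptive Multi-Agent Refinement with Monitor-Based RAG for Scientific Reasoning | Eigen-1: adaptive multi-agent refinement with monitor-based RAG for scientific reasoning. | large language model |
| 20 | Who's Laughing Now? An Overview of Computational Humour Generation and Explanation | A survey of computational humour generation and explanation, exploring NLP applications and challenges in understanding and creating humour. | large language model |
| 21 | Which Cultural Lens Do Models Adopt? On Cultural Positioning Bias and Agentic Mitigation in LLMs | Reveals cultural positioning bias in LLMs and proposes an agent-based mitigation method. | large language model |
| 22 | PMark: Towards Robust and Distortion-free Semantic-level Watermarking with Channel Constraints | PMark: a robust, distortion-free semantic-level watermarking method based on channel constraints. | large language model |
| 23 | Behind RoPE: How Does Causal Mask Encode Positional Information? | Reveals the mechanism behind RoPE: how the causal mask encodes positional information. | large language model |
| 24 | Generative AI for FFRDCs | Uses generative AI to accelerate text analysis at FFRDCs, improving government agency efficiency and security. | large language model |
| 25 | Analysis of instruction-based LLMs' capabilities to score and judge text-input problems in an academic setting | Proposes an LLM-based automatic grading system for text-input problems in academic settings; supplying reference answers works best. | large language model |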

🔬 Pillar 2: RL & Architecture (9 papers)

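| # | Title | One-line summary | Tags |
|---|---|---|---|
| 26 | Multi-Objective Reinforcement Learning for Large Language Model Optimization: Visionary Perspective | Proposes a visionary multi-objective reinforcement learning framework for optimizing large language models. | reinforcement learning, large language model |
| 27 | Retrieval over Classification: Integrating Relation Semantics for Multimodal Relation Extraction | Proposes the ROC framework, which recasts multimodal relation extraction as a retrieval task, improving fine-grained relation understanding. | contrastive learning, large language model, multimodal |
| 28 | Painless Activation Steering: An Automated, Lightweight Approach for Post-Training Large Language Models | Proposes Painless Activation Steering (PAS), a fully automated, lightweight activation steering method for post-training large language models. | reinforcement learning, large language model |
| 29 | SciReasoner: Laying the Scientific Reasoning Ground Across Disciplines | SciReasoner: builds a cross-disciplinary foundation model for scientific reasoning. | reinforcement learning, reward shaping, foundation model |
| 30 | Dual-Head Reasoning Distillation: Improving Classifier Accuracy with Train-Time-Only Reasoning | Proposes Dual-Head Reasoning Distillation (DHRD), which improves classifier accuracy without sacrificing inference speed. | distillation, chain-of-thought |
| 31 | Learning to Reason with Mixture of Tokens | Proposes a mixture-of-tokens generation method that improves LLM reasoning in reinforcement learning with verifiable rewards. | reinforcement learning, large language model, chain-of-thought |
| 32 | Hallucination reduction with CASAL: Contrastive Activation Steering For Amortized Learning | CASAL: contrastive activation steering for amortized learning, effectively reducing hallucination in large language models. | DPO, large language model |
| 33 | A State-of-the-Art SQL Reasoning Model using RLVR | Trains a SQL reasoning model with reinforcement learning with verifiable rewards (RLVR), reaching state of the art on the BIRD dataset. | reinforcement learning, offline RL |
| 34 | RLBFF: Binary Flexible Feedback to bridge between Human Feedback & Verifiable Rewards | Proposes RLBFF, which combines human feedback and verifiable rewards to improve LLM alignment and support custom principles at inference time. | reinforcement learning, RLHF |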

🔬 Pillar 3: Perception & Semantics (1 paper)

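| # | Title | One-line summary | Tags |
|---|---|---|---|
| 35 | Vision Language Models Cannot Plan, but Can They Formalize? | Proposes using VLMs as formalization tools to solve multimodal planning problems. | open-vocabulary, open vocabulary, multimodal |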
