cs.CL (2026-04-30)

📊 40 papers in total | 🔗 3 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (34 🔗2) · Pillar 2: RL Algorithms & Architecture (5 🔗1) · Pillar 6: Video Extraction and Matching (1)

🔬 Pillar 9: Embodied Foundation Models (34 papers)

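# | Title | One-line summary | Tags | 🔗
1 | DPN-LE: Dual Personality Neuron Localization and Editing for Large Language Models | DPN-LE: precise personality control of large language models through dual personality neuron localization and editing | large language model
2 | HealthBench Professional: Evaluating Large Language Models on Real Clinician Chats | HealthBench Professional: evaluates large language models on real clinician chats | large language model
3 | ScaleBox: Enabling High-Fidelity and Scalable Code Verification for Large Language Models | Proposes ScaleBox to address the accuracy and efficiency of large-scale code verification | large language model
4 | Exploring Applications of Transfer-State Large Language Models: Cognitive Profiling and Socratic AI Tutoring | Explores applications of transfer-state large language models: cognitive profiling and Socratic AI tutoring | large language model
5 | Stable Behavior, Limited Variation: Persona Validity in LLM Agents for Urban Sentiment Perception | Shows that persona settings in LLM agents for urban sentiment perception are stable but produce limited variation | large language model, multimodal
6 | MiniCPM-o 4.5: Towards Real-Time Full-Duplex Omni-Modal Interaction | MiniCPM-o 4.5: a lightweight large model for real-time, full-duplex, omni-modal interaction | large language model, multimodal
7 | Mapping how LLMs debate societal issues when shadowing human personality traits, sociodemographics and social media behavior | Builds a cognitive digital shadow dataset to evaluate LLM performance and bias in simulated societal debates | large language model
8 | Reliable Answers for Recurring Questions: Boosting Text-to-SQL Accuracy with Template Constrained Decoding | Proposes TeCoD, which uses template constrained decoding to improve Text-to-SQL accuracy and efficiency in complex scenarios | large language model
9 | Can AI Be a Good Peer Reviewer? A Survey of Peer Review Process, Evaluation, and the Future | Survey: the application, evaluation, and future development of large language models in the peer review process | large language model
10 | Instruction-Guided Poetry Generation in Arabic and Its Dialects | Proposes InstructPoet-AR for instruction-guided, controllable poetry generation in Arabic and its dialects | large language model
11 | APPSI-139: A Parallel Corpus of English Application Privacy Policy Summarization and Interpretation | Builds APPSI-139, a high-quality parallel corpus for English privacy policy summarization and interpretation, and proposes the hybrid framework TCSI-pp-V2 | large language model
12 | Debiasing Reward Models via Causally Motivated Inference-Time Intervention | Proposes a causal-intervention debiasing method for reward models to improve large language model alignment | large language model
13 | Models Recall What They Violate: Constraint Adherence in Multi-Turn LLM Ideation | DriftBench: reveals constraint violations in multi-turn LLM iteration and proposes the knowledge-violation rate (KBV) metric | large language model
14 | Reasoning over Object Descriptions Improves Coreference Resolution in Task-Based Dialogue Systems | Proposes an LLM method that reasons over object descriptions to improve coreference resolution in task-based dialogue systems | large language model
15 | ZipCCL: Efficient Lossless Data Compression of Communication Collectives for Accelerating LLM Training | ZipCCL: accelerates LLM training through lossless compression of communication collectives | large language model
16 | Language Ideologies in a Multilingual Society: An LLM-based Analysis of Luxembourgish News Comments | Uses LLMs to analyze language ideologies in Luxembourgish news comments, revealing identity construction in a multilingual society | large language model
17 | RoadMapper: A Multi-Agent System for Roadmap Generation of Solving Complex Research Problems | Proposes the RoadMapper multi-agent system, which improves LLM generation of research roadmaps and saves expert time | large language model
18 | Entropy of Ukrainian | Presents the first entropy measurement of Ukrainian to assess language complexity | large language model
19 | Skills-Coach: A Self-Evolving Skill Optimizer via Training-Free GRPO | Skills-Coach: self-evolving optimization of LLM agent skills via training-free GRPO | large language model
20 | A Reproducibility Study of LLM-Based Query Reformulation | A reproducibility study of LLM-based query reformulation methods, revealing performance differences across retrieval paradigms | large language model
21 | From Unstructured to Structured: LLM-Guided Attribute Graphs for Entity Search and Ranking | Proposes an LLM-driven attribute graph method to improve entity search and ranking in e-commerce scenarios | large language model
22 | Emotion-Aware Clickbait Attack in Social Media | Proposes an emotion-aware clickbait attack framework that evades existing detection systems by optimizing emotional impact | large language model
23 | LLMs Capture Emotion Labels, Not Emotion Uncertainty: Distributional Analysis and Calibration of Human--LLM Judgment Gaps | Shows that LLMs mainly capture emotion labels rather than emotion uncertainty, and proposes a calibration method to narrow the human-LLM gap | large language model
24 | To Diff or Not to Diff? Structure-Aware and Adaptive Output Formats for Efficient LLM-based Code Editing | Proposes AdaEdit, a structure-aware adaptive editing method that improves LLM code editing efficiency and reduces cost | large language model
25 | What Don't You Understand? Using Large Language Models to Identify and Characterize Student Misconceptions About Challenging Topics | Uses large language models to identify and analyze student misconceptions about challenging biomedical science topics | large language model
26 | Exploring Applications of Transfer-State Large Language Models: Cognitive Profiling and Socratic AI Tutoring | Explores applications of transfer-state large language models: cognitive profiling and Socratic AI tutoring | large language model
27 | Retrieval-Augmented Reasoning for Chartered Accountancy | CA-ThinkFlow: a retrieval-augmented reasoning framework for Indian chartered accountancy | large language model, chain-of-thought
28 | ViLegalNLI: Natural Language Inference for Vietnamese Legal Texts | Proposes ViLegalNLI, the first large-scale natural language inference dataset for Vietnamese legal texts, to advance legal text understanding | large language model
29 | Why Do LLMs Struggle in Strategic Play? Broken Links Between Observations, Beliefs, and Actions | Reveals why LLMs struggle in strategic play: broken links between observations, beliefs, and actions | large language model
30 | How Frontier LLMs Adapt to Neurodivergence Context: A Measurement Framework for Surface vs. Structural Change in System-Prompted Responses | NDBench: evaluates the adaptability and structural adjustment of frontier LLMs in neurodivergence contexts | large language model
31 | Estimating LLM Grading Ability and Response Difficulty in Automatic Short Answer Grading via Item Response Theory | Proposes a method based on item response theory for assessing LLM grading ability in automatic short answer grading | large language model
32 | Confidence Estimation in Automatic Short Answer Grading with LLMs | Proposes a hybrid confidence framework to improve the reliability of LLMs in automatic short answer grading | large language model
33 | RouteProfile: Elucidating the Design Space of LLM Profiles for Routing | Proposes RouteProfile to optimize LLM routing performance | large language model
34 | LLMs Capture Emotion Labels, Not Emotion Uncertainty: Distributional Analysis and Calibration of Human-LLM Judgment Gaps | Shows that large language models capture emotion labels well but struggle to model emotion uncertainty | large language model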

🔬 Pillar 2: RL Algorithms & Architecture (5 papers)

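# | Title | One-line summary | Tags | 🔗
35 | TwinGate: Stateful Defense against Decompositional Jailbreaks in Untraceable Traffic via Asymmetric Contrastive Learning | TwinGate: defends against decompositional jailbreak attacks in untraceable traffic through asymmetric contrastive learning | contrastive learning, large language model
36 | Perturbation Probing: A Two-Pass-per-Prompt Diagnostic for FFN Behavioral Circuits in Aligned LLMs | Proposes perturbation probing to diagnose FFN behavioral circuits in aligned large models | RLHF, large language model
37 | Learning When to Remember: Risk-Sensitive Contextual Bandits for Abstention-Aware Memory Retrieval in LLM-Based Coding Agents | Proposes RSCB-MC to address risk control in memory retrieval for LLM coding agents | reward design, large language model
38 | From Coarse to Fine: Benchmarking and Reward Modeling for Writing-Centric Generation Tasks | Proposes the WEval evaluation suite and the WRL training framework to improve the fine-grained control of reward models in writing generation tasks | reinforcement learning, large language model
39 | Lost in State Space: Probing Frozen Mamba Representations | Probes frozen Mamba representations: the limits of extracting semantic information from the state space | Mamba, SSM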

🔬 Pillar 6: Video Extraction and Matching (1 paper)

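# | Title | One-line summary | Tags | 🔗
40 | Timing is Everything: Temporal Scaffolding of Semantic Surprise in Humor | Proposes a dual prediction-violation framework, revealing the key role of temporal structure in humor understanding | HuMoR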
