cs.CL (2025-12-19)

📊 18 papers in total | 🔗 1 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (14 🔗1) · Pillar 2: RL & Architecture (3) · Pillar 1: Robot Control (1)

🔬 Pillar 9: Embodied Foundation Models (14 papers)

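| # | Title | One-sentence summary | Tags | 🔗 |
|---|-------|----------------------|------|----|
| 1 | CIFE: Code Instruction-Following Evaluation | CIFE: proposes a code instruction-following evaluation benchmark that measures how well LLMs adhere to developer constraints in code generation. | large language model, instruction following | |
| 2 | A Multi-Stage Workflow for the Review of Marketing Content with Reasoning Large Language Models | Proposes a multi-stage workflow that uses fine-tuned LLMs to review marketing content without external knowledge. | large language model | |
| 3 | UCoder: Unsupervised Code Generation by Internal Probing of Large Language Models | UCoder: unsupervised code generation via internal probing of large language models. | large language model | |
| 4 | Governance-Aware Hybrid Fine-Tuning for Multilingual Large Language Models | Proposes a governance-aware hybrid fine-tuning framework for low-resource adaptation of multilingual large language models. | large language model | |
| 5 | The Instruction Gap: LLMs get lost in Following Instruction | Reveals the instruction-following gap in large language models and evaluates performance in enterprise RAG scenarios. | large language model, instruction following | |
| 6 | Are Vision Language Models Cross-Cultural Theory of Mind Reasoners? | Proposes the CulturalToM-VQA benchmark to evaluate vision-language models on cross-cultural theory-of-mind reasoning. | chain-of-thought | |
| 7 | Confidence-Credibility Aware Weighted Ensembles of Small LLMs Outperform Large LLMs in Emotion Detection | Confidence- and credibility-weighted ensembles of small LLMs outperform large LLMs in emotion detection. | large language model | |
| 8 | WRAVAL -- WRiting Assist eVALuation | WRAVAL: an evaluation framework for the writing-assistance capabilities of small language models. | large language model | |
| 9 | AncientBench: Towards Comprehensive Evaluation on Excavated and Transmitted Chinese Corpora | Proposes AncientBench for comprehensive evaluation of models' understanding of excavated and transmitted ancient Chinese corpora. | large language model | |
| 10 | Consistency-Aware Editing for Entity-level Unlearning in Language Models | Proposes CAE, a consistency-aware editing framework for effectively erasing entity-level knowledge in language models. | large language model | |
| 11 | OpenAI GPT-5 System Card | OpenAI releases the GPT-5 system, which improves real-world application performance through intelligent routing and safety training. | instruction following | |
| 12 | DEER: A Benchmark for Evaluating Deep Research Agents on Expert Report Generation | DEER: a benchmark for evaluating deep research agents on expert report generation. | large language model | |
| 13 | When the Gold Standard isn't Necessarily Standard: Challenges of Evaluating the Translation of User-Generated Content | Addresses inconsistent standards in evaluating translations of user-generated content by proposing a standardness-aware evaluation framework. | large language model | |
| 14 | Linear Personality Probing and Steering in LLMs: A Big Five Study | Controls LLM personality with linear probes and steering: a study based on the Big Five personality traits. | large language model | |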

🔬 Pillar 2: RL & Architecture (3 papers)

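| # | Title | One-sentence summary | Tags | 🔗 |
|---|-------|----------------------|------|----|
| 15 | Reinforcement Learning for Chain of Thought Compression with One-Domain-to-All Generalization | Proposes a reinforcement-learning-based chain-of-thought compression method that achieves cross-domain generalization and efficiency gains. | reinforcement learning, large language model, instruction following | |
| 16 | ReGal: A First Look at PPO-based Legal AI for Judgment Prediction and Summarization in India | Proposes ReGal, a PPO-based legal AI framework for judgment prediction and summarization in India. | reinforcement learning, PPO | |
| 17 | Seed-Prover 1.5: Mastering Undergraduate-Level Theorem Proving via Learning from Experience | Seed-Prover 1.5: mastering undergraduate-level theorem proving by learning from experience. | reinforcement learning, large language model | |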

🔬 Pillar 1: Robot Control (1 paper)

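| # | Title | One-sentence summary | Tags | 🔗 |
|---|-------|----------------------|------|----|
| 18 | Physics of Language Models: Part 4.1, Architecture Design and the Magic of Canon Layers | Proposes Canon Layers to enhance horizontal information flow and reasoning ability in language models. | manipulation, Mamba, linear attention | |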
