cs.CL (2025-12-02)

📊 27 papers in total | 🔗 5 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (22 🔗5) · Pillar 2: RL & Architecture (5)

🔬 Pillar 9: Embodied Foundation Models (22 papers)

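| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 1 | The Moral Consistency Pipeline: Continuous Ethical Evaluation for Large Language Models | Proposes the Moral Consistency Pipeline (MoCoP) for continuous ethical evaluation of large language models | large language model | |
| 2 | Fine-Tuned Large Language Models for Logical Translation: Reducing Hallucinations with Lang2Logic | Proposes the Lang2Logic framework, which uses fine-tuned LLMs to reduce hallucinations in logical translation | large language model | |
| 3 | BOOM: Beyond Only One Modality KIT's Multimodal Multilingual Lecture Companion | Proposes BOOM to address the localization of multimodal, multilingual lecture content | multimodal | |
| 4 | A benchmark dataset for evaluating Syndrome Differentiation and Treatment in large language models | Builds TCM-BEST4SDT, a benchmark for LLMs in traditional Chinese medicine that evaluates syndrome differentiation and treatment ability | large language model | |
| 5 | Towards Unification of Hallucination Detection and Fact Verification for Large Language Models | Proposes UniFact, a unified framework that bridges the research gap between LLM hallucination detection and fact verification | large language model | |
| 6 | PEFT-Factory: Unified Parameter-Efficient Fine-Tuning of Autoregressive Large Language Models | PEFT-Factory: a unified parameter-efficient fine-tuning framework for autoregressive large language models | large language model | |
| 7 | Spoken Conversational Agents with Large Language Models | Spoken conversational agents are moving toward speech-native LLMs; this tutorial offers a system-level roadmap | large language model | |
| 8 | TaleFrame: An Interactive Story Generation System with Fine-Grained Control and Large Language Models | TaleFrame: an interactive story generation system with fine-grained control, combining large language models with human-computer interaction | large language model | |
| 9 | Emergent Bayesian Behaviour and Optimal Cue Combination in LLMs | Proposes the BayesBench benchmark to evaluate Bayesian behaviour and optimal cue combination in LLMs on multimodal perception tasks | large language model, multimodal | |
| 10 | Randomized Masked Finetuning: An Efficient Way to Mitigate Memorization of PIIs in LLMs | Proposes Randomized Masked Finetuning to mitigate privacy leakage in large language models | large language model | |
| 11 | Is Vibe Coding Safe? Benchmarking Vulnerability of Agent-Generated Code in Real-World Tasks | The SUSVIBES benchmark reveals serious security vulnerabilities in agent-generated code on real-world software engineering tasks | large language model | |
| 12 | InvertiTune: High-Quality Data Synthesis for Cost-Effective Single-Shot Text-to-Knowledge Graph Generation | InvertiTune: cost-effective single-shot text-to-knowledge-graph generation through high-quality data synthesis | large language model | |
| 13 | Enhancing Job Matching: Occupation, Skill and Qualification Linking with the ESCO and EQF taxonomies | Uses language models to enhance job matching by linking occupations and skills to European taxonomies | large language model | |
| 14 | Variance-Aware LLM Annotation for Strategy Research: Sources, Diagnostics, and a Protocol for Reliable Measurement | Proposes a variance-aware LLM annotation protocol that improves the reliability and reproducibility of text annotation in strategy research | large language model | |
| 15 | Cross-Lingual Prompt Steerability: Towards Accurate and Robust LLM Behavior across Languages | Proposes a cross-lingual prompt steerability framework to improve LLM accuracy and robustness in multilingual settings | large language model | |
| 16 | promptolution: A Unified, Modular Framework for Prompt Optimization | Proposes promptolution, a unified, modular prompt optimization framework for improving LLM performance | large language model | |
| 17 | Noise-Driven Persona Formation in Reflexive Neural Language Generation | Proposes the Luca-Noise reflexive protocol to study noise-driven persona emergence in large language models | large language model | |
| 18 | CREST: Universal Safety Guardrails Through Cluster-Guided Cross-Lingual Transfer | CREST: universal safety guardrails via cluster-guided cross-lingual transfer | large language model | |
| 19 | An Empirical Survey of Model Merging Algorithms for Social Bias Mitigation | Model merging algorithms for social bias mitigation: an empirical study on LLMs | large language model | |
| 20 | Input Order Shapes LLM Semantic Alignment in Multi-Document Summarization | Input order affects LLM semantic alignment in multi-document summarization, with a significant primacy effect for the first document | large language model | |
| 21 | LeechHijack: Covert Computational Resource Exploitation in Intelligent Agent Systems | Proposes the LeechHijack attack, revealing the risk of covert resource hijacking by third-party tools in agent systems | large language model | |
| 22 | When Does Verification Pay Off? A Closer Look at LLMs as Solution Verifiers | Studies the effectiveness of LLMs as solution verifiers, revealing the benefits of cross-model verification and the impact of post-training | large language model | |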

🔬 Pillar 2: RL & Architecture (5 papers)

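| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 23 | DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models | DeepSeek-V3.2: improves LLM reasoning and agent performance through sparse attention, reinforcement learning, and agent task synthesis | reinforcement learning, IMoS, large language model | |
| 24 | Separating Constraint Compliance from Semantic Accuracy: A Novel Benchmark for Evaluating Instruction-Following Under Compression | Proposes the CDCT benchmark, revealing the trade-off between constraint compliance and semantic accuracy in LLM instruction following under compression | RLHF, large language model, instruction following | |
| 25 | SR-GRPO: Stable Rank as an Intrinsic Geometric Reward for Large Language Model Alignment | Proposes SR-GRPO, which uses stable rank as an intrinsic reward signal for unsupervised LLM alignment | reinforcement learning, large language model | |
| 26 | From Imitation to Discrimination: Toward A Generalized Curriculum Advantage Mechanism Enhancing Cross-Domain Reasoning Tasks | Proposes CAPO, a curriculum-advantage-based policy optimization method that improves cross-domain reasoning | reinforcement learning, PPO, imitation learning | |
| 27 | ADORE: Autonomous Domain-Oriented Relevance Engine for E-commerce | ADORE: an autonomous domain-oriented relevance engine for e-commerce that addresses data scarcity and the semantic gap | distillation, chain-of-thought | |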
