cs.CL (2025-07-22)

📊 23 papers in total | 🔗 4 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (18 🔗4) · Pillar 2: RL & Architecture (4) · Pillar 4: Generative Motion (1)

🔬 Pillar 9: Embodied Foundation Models (18 papers)

#	Title	Key point	Tags	🔗	⭐
1	P-CoT: A Pedagogically-motivated Participatory Chain-of-Thought Prompting for Phonological Reasoning in LLMs	Proposes the P-CoT prompting method to improve LLM performance on phonological reasoning tasks	large language model, chain-of-thought
2	Argument Quality Annotation and Gender Bias Detection in Financial Communication through Large Language Models	Uses large language models to assess argument quality in financial texts and to detect gender bias	large language model
3	Towards Automated Regulatory Compliance Verification in Financial Auditing with Large Language Models	Uses large language models to automate regulatory compliance verification in financial auditing	large language model
4	Exploring Gender Bias in Large Language Models: An In-depth Dive into the German Language	Proposes German-language gender bias evaluation datasets, revealing unique challenges in multilingual LLMs	large language model
5	Agentar-Fin-R1: Enhancing Financial Intelligence through Domain Expertise, Training Efficiency, and Advanced Reasoning	Agentar-Fin-R1: enhances financial intelligence through domain knowledge, efficient training, and advanced reasoning	large language model, foundation model	✅
6	Can External Validation Tools Improve Annotation Quality for LLM-as-a-Judge?	Proposes a tool-augmented AI evaluation system that improves LLM evaluation quality on factuality, math, and code tasks	large language model	✅
7	Obscured but Not Erased: Evaluating Nationality Bias in LLMs via Name-Based Bias Benchmarks	Proposes a name-based bias evaluation method that reveals hidden nationality bias in LLMs	large language model
8	LingBench++: A Linguistically-Informed Benchmark and Reasoning Framework for Multi-Step and Cross-Cultural Inference with LLMs	LingBench++: a linguistically informed benchmark for multi-step reasoning and cross-cultural inference with LLMs	large language model
9	Test-Time-Matching: Decouple Personality, Memory, and Linguistic Style in LLM-based Role-Playing Language Agent	Proposes the Test-Time-Matching framework, which personalizes LLM role-playing language agents without training	large language model
10	Beyond Context Limits: Subconscious Threads for Long-Horizon Reasoning	Proposes TIM, a thread-based reasoning model that overcomes LLM context-length limits to enable long-horizon reasoning	large language model
11	How Deep Is Representational Bias in LLMs? The Cases of Caste and Religion	Systematically audits GPT-4 Turbo to reveal representational bias in LLMs	large language model	✅
12	PICACO: Pluralistic In-Context Value Alignment of LLMs via Total Correlation Optimization	Proposes PICACO to address pluralistic value alignment in large language models	large language model
13	The Ever-Evolving Science Exam	Proposes EESE, a dynamically evolving science exam benchmark for reliably assessing the scientific understanding of foundation models	foundation model	✅
14	ICR Probe: Tracking Hidden State Dynamics for Reliable Hallucination Detection in LLMs	Proposes ICR Probe, which tracks the dynamics of LLM hidden states for reliable hallucination detection	large language model
15	Towards Enforcing Company Policy Adherence in Agentic Workflows	Proposes an agent workflow framework that enforces company policies, addressing policy adherence by LLM agents	large language model
16	Beyond Isolated Dots: Benchmarking Structured Table Construction as Deep Knowledge Extraction	Proposes the AOE benchmark to evaluate how well LLMs extract structured table information from complex documents	large language model
17	iShumei-Chinchunmei at SemEval-2025 Task 4: A balanced forgetting and retention multi-task framework using effective unlearning loss	Proposes an effective unlearning loss that balances forgetting and retention in LLMs, targeting the erasure of sensitive content	large language model
18	Towards Compute-Optimal Many-Shot In-Context Learning	Proposes compute-optimal strategies for selecting many-shot demonstrations in long-context in-context learning	large language model

🔬 Pillar 2: RL & Architecture (4 papers)

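#	Title	Key point	Tags	🔗	⭐
19	Harnessing RLHF for Robust Unanswerability Recognition and Trustworthy Response Generation in LLMs	Proposes SALU, which uses RLHF to improve LLMs' recognition of unanswerable questions and their generation of trustworthy responses in conversational information retrieval	reinforcement learning, RLHF, large language model
20	Re:Form -- Reducing Human Priors in Scalable Formal Software Verification with RL in LLMs: A Preliminary Study on Dafny	Re:Form: uses formal verification feedback and reinforcement learning to reduce the reliance of LLM-based software verification on human priors	reinforcement learning, large language model, chain-of-thought
21	Efficient RL for optimizing conversation level outcomes with an LLM-based tutor	Proposes a reinforcement learning method for an LLM-based conversational tutor that optimizes long-term student learning outcomes	reinforcement learning, RLHF, large language model
22	Turning Internal Gap into Self-Improvement: Promoting the Generation-Understanding Unification in MLLMs	Proposes a self-improvement framework based on the internal gap, improving the generation ability of multimodal LLMs and promoting generation-understanding unification	DPO, curriculum learning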

🔬 Pillar 4: Generative Motion (1 paper)

#	Title	Key point	Tags	🔗	⭐
23	Pixels to Principles: Probing Intuitive Physics Understanding in Multimodal Language Models	Evaluates how well multimodal LLMs understand intuitive physics tasks, revealing vision-language alignment problems	physically plausible, large language model, multimodal
