cs.CL (2025-07-22)

📊 23 papers in total | 🔗 4 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (18 🔗4) | Pillar 2: RL Algorithms & Architecture (4) | Pillar 4: Generative Motion (1)

🔬 Pillar 9: Embodied Foundation Models (18 papers)

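| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 1 | P-CoT: A Pedagogically-motivated Participatory Chain-of-Thought Prompting for Phonological Reasoning in LLMs | Proposes the P-CoT prompting method to improve LLM performance on phonological reasoning tasks. | large language model, chain-of-thought | |
| 2 | Argument Quality Annotation and Gender Bias Detection in Financial Communication through Large Language Models | Uses large language models to assess argument quality in financial texts and to detect gender bias. | large language model | |
| 3 | Towards Automated Regulatory Compliance Verification in Financial Auditing with Large Language Models | Uses large language models to automate regulatory compliance verification in financial auditing. | large language model | |
| 4 | Exploring Gender Bias in Large Language Models: An In-depth Dive into the German Language | Proposes a German-language gender bias evaluation dataset, revealing challenges specific to multilingual LLMs. | large language model | |
| 5 | Agentar-Fin-R1: Enhancing Financial Intelligence through Domain Expertise, Training Efficiency, and Advanced Reasoning | Agentar-Fin-R1: enhances financial intelligence through domain knowledge, efficient training, and advanced reasoning. | large language model, foundation model | |
| 6 | Can External Validation Tools Improve Annotation Quality for LLM-as-a-Judge? | Proposes a tool-augmented AI evaluation system that improves LLM evaluation quality on factuality, math, and code tasks. | large language model | |
| 7 | Obscured but Not Erased: Evaluating Nationality Bias in LLMs via Name-Based Bias Benchmarks | Proposes a name-based bias evaluation method, revealing hidden nationality bias in LLMs. | large language model | |
| 8 | LingBench++: A Linguistically-Informed Benchmark and Reasoning Framework for Multi-Step and Cross-Cultural Inference with LLMs | LingBench++: a linguistically informed benchmark for multi-step reasoning and cross-cultural inference with LLMs. | large language model | |
| 9 | Test-Time-Matching: Decouple Personality, Memory, and Linguistic Style in LLM-based Role-Playing Language Agent | Proposes the Test-Time-Matching framework, enabling training-free personalization of LLM role-playing language agents. | large language model | |
| 10 | Beyond Context Limits: Subconscious Threads for Long-Horizon Reasoning | Proposes the thread inference model TIM, which overcomes LLM context-length limits to enable long-horizon reasoning. | large language model | |
| 11 | How Deep Is Representational Bias in LLMs? The Cases of Caste and Religion | Systematically audits GPT-4 Turbo to reveal representational bias in LLMs. | large language model | |
| 12 | PICACO: Pluralistic In-Context Value Alignment of LLMs via Total Correlation Optimization | Proposes PICACO to address pluralistic value alignment of large language models. | large language model | |
| 13 | The Ever-Evolving Science Exam | Proposes EESE, a dynamically evolving science exam benchmark for reliably evaluating the scientific understanding of foundation models. | foundation model | |
| 14 | ICR Probe: Tracking Hidden State Dynamics for Reliable Hallucination Detection in LLMs | Proposes ICR Probe, which tracks the dynamics of LLM hidden states for reliable hallucination detection. | large language model | |
| 15 | Towards Enforcing Company Policy Adherence in Agentic Workflows | Proposes an agentic workflow framework that enforces company policies, addressing policy adherence in LLM agents. | large language model | |
| 16 | Beyond Isolated Dots: Benchmarking Structured Table Construction as Deep Knowledge Extraction | Proposes the AOE benchmark, which evaluates LLMs' ability to extract structured tabular information from complex documents. | large language model | |
| 17 | iShumei-Chinchunmei at SemEval-2025 Task 4: A balanced forgetting and retention multi-task framework using effective unlearning loss | Proposes an effective unlearning loss that balances forgetting and retention in LLMs, addressing the erasure of sensitive content. | large language model | |
| 18 | Towards Compute-Optimal Many-Shot In-Context Learning | For long-context in-context learning, proposes compute-optimal strategies for selecting many-shot demonstrations. | large language model | |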

🔬 Pillar 2: RL Algorithms & Architecture (4 papers)

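| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 19 | Harnessing RLHF for Robust Unanswerability Recognition and Trustworthy Response Generation in LLMs | Proposes SALU, which uses RLHF to improve LLMs' recognition of unanswerable questions and their trustworthy response generation in conversational information retrieval. | reinforcement learning, RLHF, large language model | |
| 20 | Re:Form -- Reducing Human Priors in Scalable Formal Software Verification with RL in LLMs: A Preliminary Study on Dafny | Re:Form: reduces the reliance of LLM-based software verification on human priors through formal verification feedback and reinforcement learning. | reinforcement learning, large language model, chain-of-thought | |
| 21 | Efficient RL for optimizing conversation level outcomes with an LLM-based tutor | Proposes a reinforcement learning method for LLM-based conversational tutoring that optimizes long-term student learning outcomes. | reinforcement learning, RLHF, large language model | |
| 22 | Turning Internal Gap into Self-Improvement: Promoting the Generation-Understanding Unification in MLLMs | Proposes a self-improvement framework based on the internal gap, improving the generation capability of multimodal LLMs and promoting generation-understanding unification. | DPO, curriculum learning | |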

🔬 Pillar 4: Generative Motion (1 paper)

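| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 23 | Pixels to Principles: Probing Intuitive Physics Understanding in Multimodal Language Models | Evaluates how well multimodal large language models understand intuitive physics tasks, revealing vision-language alignment problems. | physically plausible, large language model, multimodal | |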
