cs.CL(2025-09-01)

📊 21 papers in total | 🔗 5 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (18 🔗5) · Pillar 2: RL Algorithms & Architecture (3)

🔬 Pillar 9: Embodied Foundation Models (18 papers)

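| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 1 | Vis-CoT: A Human-in-the-Loop Framework for Interactive Visualization and Intervention in LLM Chain-of-Thought Reasoning | Vis-CoT: a human-in-the-loop framework for interactively visualizing LLM chain-of-thought reasoning. | large language model, chain-of-thought | |
| 2 | CAT: Causal Attention Tuning For Injecting Fine-grained Causal Knowledge into Large Language Models | Proposes Causal Attention Tuning (CAT) to inject fine-grained causal knowledge into large language models. | large language model | |
| 3 | Trusted Uncertainty in Large Language Models: A Unified Framework for Confidence Calibration and Risk-Controlled Refusal | Proposes the UniCR framework, which calibrates uncertainty evidence to enable risk-controlled refusal in large language models. | large language model | |
| 4 | On the Alignment of Large Language Models with Global Human Opinion | Proposes a framework based on the World Values Survey to evaluate how well large language models align with global human opinion. | large language model | |
| 5 | Can Large Language Models Master Complex Card Games? | Explores LLM ability in complex card games: achieving human-like intelligence through fine-tuning. | large language model | |
| 6 | WATCHED: A Web AI Agent Tool for Combating Hate Speech by Expanding Data | Proposes WATCHED, an AI agent that combines an LLM with specialized tools to help content moderators combat online hate speech. | large language model, chain-of-thought | |
| 7 | ShortageSim: Simulating Drug Shortages under Information Asymmetry | ShortageSim: the first simulation framework for regulatory interventions in drug shortages under information asymmetry. | large language model | |
| 8 | Flaw or Artifact? Rethinking Prompt Sensitivity in Evaluating LLMs | Revisits LLM prompt sensitivity: an artifact of the evaluation method or a flaw in the model? | large language model | |
| 9 | Where Should I Study? Biased Language Models Decide! Evaluating Fairness in LMs for Academic Recommendations | Proposes a multi-dimensional evaluation framework to address bias in language model recommendations. | large language model | |
| 10 | Benchmarking the Detection of LLMs-Generated Modern Chinese Poetry | Builds a detection benchmark for modern Chinese poetry and evaluates how well existing models identify LLM-generated poems. | large language model | |
| 11 | Do Retrieval Augmented Language Models Know When They Don't Know? | Studies the refusal ability of retrieval-augmented language models (RALMs) and proposes improvements that balance refusal against correct answering. | large language model | |
| 12 | Robust Knowledge Editing via Explicit Reasoning Chains for Distractor-Resilient Multi-Hop QA | Proposes the Reason-KE framework, which uses explicit reasoning chains to make LLM knowledge editing robust in multi-hop QA. | large language model | |
| 13 | LLMs cannot spot math errors, even when allowed to peek into the solution | LLMs struggle to find errors in the steps of math solutions, even when allowed to see the reference answer. | large language model | |
| 14 | LongCat-Flash Technical Report | LongCat-Flash: a 560-billion-parameter MoE language model with efficient computation and advanced agentic capabilities. | foundation model | |
| 15 | Natural Context Drift Undermines the Natural Language Understanding of Large Language Models | Proposes a framework for analyzing how the natural evolution of text affects LLM question-answering ability. | large language model | |
| 16 | Serialized Output Prompting for Large Language Model-based Multi-Talker Speech Recognition | Proposes serialized output prompting to improve multi-talker speech recognition performance. | large language model | |
| 17 | Assessing Large Language Models on Islamic Legal Reasoning: Evidence from Inheritance Law Evaluation | Evaluates large language models on reasoning about Islamic inheritance law. | large language model | |
| 18 | REFRAG: Rethinking RAG based Decoding | Proposes REFRAG to address the efficiency of RAG decoding. | large language model | |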

🔬 Pillar 2: RL Algorithms & Architecture (3 papers)

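| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 19 | Reasoning Vectors: Transferring Chain-of-Thought Capabilities via Task Arithmetic | Proposes a task-arithmetic method for transferring reasoning vectors to improve the reasoning ability of large language models. | reinforcement learning, large language model, chain-of-thought | |
| 20 | Enhancing Uncertainty Estimation in LLMs with Expectation of Aggregated Internal Belief | EAGLE: improves uncertainty estimation using the expectation of aggregated LLM internal beliefs. | reinforcement learning, RLHF, large language model | |
| 21 | We Politely Insist: Your LLM Must Learn the Persian Art of Taarof | Proposes TaarofBench to address cultural understanding in large language models. | direct preference optimization, large language model | |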
