cs.CL (2025-08-28)

📊 34 papers in total | 🔗 7 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (29 🔗5) · Pillar 2: RL & Architecture (5 🔗2)

🔬 Pillar 9: Embodied Foundation Models (29 papers)

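#	Title	One-line summary	Tags	🔗
1	The Percept-V Challenge: Can Multimodal LLMs Crack Simple Perception Problems?	Proposes the Percept-V dataset to evaluate multimodal large language models on basic visual perception tasks	large language model multimodal
2	A Survey of Scientific Large Language Models: From Data Foundations to Agent Frontiers	Surveys scientific large language models, from data foundations to agent frontiers	large language model multimodal
3	Exploring Machine Learning and Language Models for Multimodal Depression Detection	Explores machine learning and language models for multimodal depression detection	large language model multimodal
4	How Does Cognitive Bias Affect Large Language Models? A Case Study on the Anchoring Effect in Price Negotiation Simulations	Finds that large language models are affected by the anchoring effect in price negotiation	large language model chain-of-thought
5	Leveraging Large Language Models for Generating Research Topic Ontologies: A Multi-Disciplinary Study	Uses large language models to generate research topic ontologies, addressing the challenge of cross-disciplinary knowledge organization	large language model chain-of-thought
6	Quantifying Label-Induced Bias in Large Language Model Self- and Cross-Evaluations	Reveals label-induced bias in large language model evaluation and underscores the importance of blind evaluation	large language model
7	Lethe: Purifying Backdoored Large Language Models with Knowledge Dilution	LETHE: purifies backdoored large language models via knowledge dilution	large language model
8	GDLLM: A Global Distance-aware Modeling Approach Based on Large Language Models for Event Temporal Relation Extraction	Proposes GDLLM, which uses global distance-aware modeling to improve large language model performance on event temporal relation extraction	large language model
9	Addressing Tokenization Inconsistency in Steganography and Watermarking Based on Large Language Models	Proposes stepwise verification and rollback methods to address tokenization inconsistency in LLM-based steganography and watermarking	large language model
10	ConspirED: A Dataset for Cognitive Traits of Conspiracy Theories and Large Language Model Safety	ConspirED: builds a dataset of the cognitive traits of conspiracy theories to evaluate large language model safety	large language model
11	CAPE: Context-Aware Personality Evaluation Framework for Large Language Models	CAPE: proposes a context-aware personality evaluation framework for LLMs, addressing existing methods' neglect of conversation history	large language model	✅
12	Benchmarking GPT-5 for biomedical natural language processing	Evaluates GPT-5 on biomedical natural language processing tasks, revealing its strengths and limitations	multimodal chain-of-thought
13	A Graph Talks, But Who's Listening? Rethinking Evaluations for Graph-Language Models	Reveals an evaluation problem for graph-language models: existing benchmarks are insufficient to assess multimodal reasoning	large language model multimodal
14	GUARD: Guideline Upholding Test through Adaptive Role-play and Jailbreak Diagnostics for LLMs	GUARD: improves LLM compliance testing through adaptive role-play and jailbreak diagnostics	large language model
15	On the Theoretical Limitations of Embedding-Based Retrieval	Reveals theoretical limitations of embedding-based retrieval: even simple queries can fail	instruction following
16	Turning the Spell Around: Lightweight Alignment Amplification via Rank-One Safety Injection	Proposes Rank-One Safety Injection (ROSI), which strengthens LLM safety alignment through rank-one weight modification	large language model
17	Decoding Memories: An Efficient Pipeline for Self-Consistency Hallucination Detection	Proposes the Decoding Memory Pipeline (DMP), which speeds up self-consistency hallucination detection and reduces compute cost	large language model
18	BED-LLM: Intelligent Information Gathering with LLMs and Bayesian Experimental Design	Proposes BED-LLM to improve the information-gathering ability of large language models	large language model
19	ProactiveEval: A Unified Evaluation Framework for Proactive Dialogue Agents	ProactiveEval: a unified evaluation framework for proactive dialogue agents	large language model
20	CoCoNUTS: Concentrating on Content while Neglecting Uninformative Textual Styles for AI-Generated Peer Review Detection	Proposes the CoCoNUTS benchmark and the CoCoDet detector for identifying AI-generated content in peer reviews, focusing on content rather than style	large language model	✅
21	Measuring Reasoning Utility in LLMs via Conditional Entropy Reduction	Evaluates the utility of LLM reasoning via conditional entropy reduction to optimize the reasoning process	large language model
22	An Agile Method for Implementing Retrieval Augmented Generation Tools in Industrial SMEs	EASI-RAG: an agile method for deploying retrieval-augmented generation tools in industrial SMEs	large language model
23	How Can Input Reformulation Improve Tool Usage Accuracy in a Complex Dynamic Environment? A Study on $τ$-bench	Proposes the IRMA framework, which significantly improves LLM tool-use accuracy in dynamic environments through input reformulation	large language model
24	Feel the Difference? A Comparative Analysis of Emotional Arcs in Real and LLM-Generated CBT Sessions	Compares emotional arcs in real and LLM-generated CBT dialogues, revealing LLMs' limitations in emotional expression	large language model	✅
25	SciTopic: Enhancing Topic Discovery in Scientific Literature through Advanced LLM	SciTopic: uses large language models to enhance topic discovery in scientific literature and improve the efficiency of research information retrieval	large language model
26	From Post To Personality: Harnessing LLMs for MBTI Prediction in Social Media	Proposes the PostToPersonality framework, which uses LLMs to predict MBTI personality types from social media posts, mitigating hallucination and addressing data imbalance	large language model
27	MCP-Bench: Benchmarking Tool-Using LLM Agents with Complex Real-World Tasks via MCP Servers	MCP-Bench: evaluates the tool-use ability of LLM agents on complex real-world tasks via MCP servers	large language model	✅
28	CAMB: A comprehensive industrial LLM benchmark on civil aviation maintenance	Proposes CAMB, a comprehensive industrial LLM benchmark for civil aviation maintenance	large language model	✅
29	Joint Enhancement of Relational Reasoning for Long-Context LLMs	Proposes the JERR framework, which strengthens relational reasoning in long-context LLMs through graph reasoning	large language model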

🔬 Pillar 2: RL & Architecture (5 papers)

#	Title	One-line summary	Tags	🔗
30	SageLM: A Multi-aspect and Explainable Large Language Model for Speech Judgement	SageLM: a multi-aspect, explainable large language model for speech judgement	reinforcement learning large language model
31	Prediction of mortality and resource utilization in critical care: a deep learning approach using multimodal electronic health records with natural language processing techniques	Proposes a deep learning framework based on multimodal EHRs and NLP to predict mortality and resource utilization in critical care	MAE multimodal
32	Graph-R1: Unleashing LLM Reasoning with NP-Hard Graph Problems	Graph-R1: uses NP-hard graph problems to improve LLM reasoning	reinforcement learning reward design large language model	✅
33	Improving Aviation Safety Analysis: Automated HFACS Classification Using Reinforcement Learning with Group Relative Policy Optimization	Proposes a reinforcement-learning-based framework for automated HFACS classification, improving the efficiency and accuracy of aviation safety analysis	reinforcement learning large language model
34	Adaptive Federated Distillation for Multi-Domain Non-IID Textual Data	Proposes AdaFD, an adaptive federated distillation framework, to address the challenges of multi-domain non-IID textual data	distillation	✅
