cs.CL (2025-09-04)

📊 27 papers in total | 🔗 9 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (26 🔗9) · Pillar 2: RL & Architecture (1)

🔬 Pillar 9: Embodied Foundation Models (26 papers)

#	Title	One-line summary	Tags	🔗
1	Sample-efficient Integration of New Modalities into Large Language Models	Proposes SEMI, a method for sample-efficient integration of new modalities into large language models	large language model foundation model multimodal
2	Facts Fade Fast: Evaluating Memorization of Outdated Medical Knowledge in Large Language Models	Evaluates how much outdated medical knowledge large language models have memorized and exposes the resulting risks	large language model
3	RTQA : Recursive Thinking for Complex Temporal Knowledge Graph Question Answering with Large Language Models	Proposes the RTQA framework, which uses large language models and recursive thinking to answer complex temporal knowledge graph questions	large language model	✅
4	Spoken in Jest, Detected in Earnest: A Systematic Review of Sarcasm Recognition -- Multimodal Fusion, Challenges, and Future Prospects	Systematic review of speech-based sarcasm recognition: multimodal fusion, challenges, and future prospects	multimodal
5	SPFT-SQL: Enhancing Large Language Model for Text-to-SQL Parsing by Self-Play Fine-Tuning	Proposes SPFT-SQL, which uses self-play fine-tuning to improve large language model performance on Text-to-SQL parsing	large language model
6	Quantized Large Language Models in Biomedical Natural Language Processing: Evaluation and Recommendation	Quantized LLMs enable lightweight deployment of biomedical NLP models, reducing GPU memory requirements by 75%	large language model
7	A Comprehensive Survey on Trustworthiness in Reasoning with Large Language Models	Survey of trustworthiness in LLM reasoning, focusing on CoT techniques and their safety challenges	large language model	✅
8	CANDY: Benchmarking LLMs' Limitations and Assistive Potential in Chinese Misinformation Fact-Checking	CANDY: evaluates the limitations and assistive potential of large language models in Chinese misinformation fact-checking	large language model chain-of-thought	✅
9	Inverse IFEval: Can LLMs Unlearn Stubborn Training Conventions to Follow Real Instructions?	Proposes the Inverse IFEval benchmark to evaluate LLMs' cognitive flexibility under adversarial instructions	large language model instruction following
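10	Towards an AI Musician: Synthesizing Sheet Music Problems for Musical Reasoning	Proposes SSMR-Bench, a set of synthesized sheet music reasoning problems for improving AI musician capabilities	large language model multimodal
11	Chain or tree? Re-evaluating complex reasoning from the perspective of a matrix of thought	Proposes the Matrix of Thought (MoT) framework to improve LLM efficiency and accuracy on complex reasoning tasks	large language model chain-of-thought	✅
12	MobileRAG: Enhancing Mobile Agent with Retrieval-Augmented Generation	MobileRAG: a retrieval-augmented generation framework that improves mobile agent performance on complex tasks	large language model	✅
13	ODKE+: Ontology-Guided Open-Domain Knowledge Extraction with LLMs	ODKE+: an ontology-guided open-domain knowledge extraction system built on LLMs, enabling large-scale, high-precision knowledge graph construction	large language model
14	Cross-Layer Attention Probing for Fine-Grained Hallucination Detection	Proposes Cross-Layer Attention Probing (CLAP) for fine-grained detection of hallucinations in large language models	large language model
15	On Robustness and Reliability of Benchmark-Based Evaluation of LLMs	Evaluates LLM robustness to paraphrased benchmark questions, revealing limits in generalization	large language model
16	VoxRole: A Comprehensive Benchmark for Evaluating Speech-Based Role-Playing Agents	Proposes VoxRole, a comprehensive benchmark for evaluating speech-based role-playing agents	large language model
17	OleSpeech-IV: A Large-Scale Multispeaker and Multilingual Conversational Speech Dataset with Diverse Topics	OleSpeech-IV: a large-scale, multispeaker, multilingual conversational speech dataset with diverse topics	TAMP
18	Why Language Models Hallucinate	Traces the root cause of language model hallucination: training and evaluation procedures that bias models toward guessing rather than acknowledging uncertainty	large language model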
19	Explicit and Implicit Data Augmentation for Social Event Detection	Proposes the SED-Aug framework, which combines explicit text augmentation with implicit feature augmentation to improve social event detection	large language model	✅
20	Towards Stable and Personalised Profiles for Lexical Alignment in Spoken Human-Agent Dialogue	Builds stable, personalised lexical profiles as a foundation for lexical alignment in conversational agents	large language model
21	SelfAug: Mitigating Catastrophic Forgetting in Retrieval-Augmented Generation via Distribution Self-Alignment	Proposes SelfAug, which mitigates catastrophic forgetting in RAG through distribution self-alignment	large language model	✅
22	Iti-Validator: A Guardrail Framework for Validating and Correcting LLM-Generated Itineraries	Iti-Validator: a guardrail framework for validating and correcting LLM-generated itineraries	large language model
23	False Sense of Security: Why Probing-based Malicious Input Detection Fails to Generalize	Reveals the root cause of why probing-based malicious input detection fails to generalize	large language model	✅
24	Drivel-ology: Challenging LLMs with Interpreting Nonsense with Depth	Drivel-ology: builds a multilingual dataset of "nonsense with depth" to challenge LLMs' pragmatic understanding	large language model
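25	SiLVERScore: Semantically-Aware Embeddings for Sign Language Generation Evaluation	Proposes SiLVERScore, a semantically aware metric for evaluating sign language generation that significantly outperforms traditional metrics	multimodal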
26	Evaluating the Robustness of Retrieval-Augmented Generation to Adversarial Evidence in the Health Domain	Evaluates the robustness of retrieval-augmented generation to adversarial evidence in the health domain	large language model	✅

🔬 Pillar 2: RL & Architecture (1 paper)

#	Title	One-line summary	Tags	🔗	⭐
27	Breaking to Build: A Threat Model of Prompt-Based Attacks for Securing LLMs	Building secure LLMs: proposes a threat model of prompt-based attacks	distillation large language model

