cs.CL (2025-08-27)

📊 32 papers in total | 🔗 4 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (24 🔗3) · Pillar 2: RL Algorithms & Architecture (8 🔗1)

🔬 Pillar 9: Embodied Foundation Models (24 papers)

#	Title	One-line summary	Tags	🔗
1	11Plus-Bench: Demystifying Multimodal LLM Spatial Reasoning with Cognitive-Inspired Analysis	Proposes 11Plus-Bench to evaluate the spatial reasoning ability of multimodal large language models	large language model multimodal
2	Uncertainty-Aware Collaborative System of Large and Small Models for Multimodal Sentiment Analysis	Proposes an uncertainty-aware collaborative system to balance performance and efficiency in multimodal sentiment analysis	large language model multimodal
3	Prompting Strategies for Language Model-Based Item Generation in K-12 Education: Bridging the Gap Between Small and Large Language Models	Proposes structured prompting strategies to improve the quality of item generation in K-12 education	large language model chain-of-thought
4	Survey of Specialized Large Language Model	Systematically surveys specialized large language models for professional-domain applications	large language model multimodal
5	MathBuddy: A Multimodal System for Affective Math Tutoring	Proposes MathBuddy to account for the effect of students' affective states on math learning	multimodal	✅
6	Dhati+: Fine-tuned Large Language Models for Arabic Subjectivity Evaluation	Proposes Dhati+ to address the shortage of data for Arabic subjectivity evaluation	large language model
7	INSEva: A Comprehensive Chinese Benchmark for Large Language Models in Insurance	Proposes the INSEva benchmark to address the lack of AI evaluation in the insurance domain	large language model
8	Geopolitical Parallax: Beyond Walter Lippmann Just After Large Language Models	Proposes a geopolitical parallax analysis to examine bias in large language models	large language model
9	Logical Reasoning with Outcome Reward Models for Test-Time Scaling	Proposes outcome reward models to improve logical reasoning under test-time scaling	large language model chain-of-thought
10	Do MLLMs Really Understand the Charts?	Proposes ChartVRBench to address the weaknesses of multimodal large language models in chart understanding	large language model multimodal
11	Benchmarking Hindi LLMs: A New Suite of Datasets and a Comparative Analysis	Proposes five Hindi LLM evaluation datasets to address evaluation challenges	large language model
12	LFD: Layer Fused Decoding to Exploit External Knowledge in Retrieval-Augmented Generation	Proposes Layer Fused Decoding to improve the use of external knowledge in retrieval-augmented generation	large language model
13	Towards a Holistic and Automated Evaluation Framework for Multi-Level Comprehension of LLMs in Book-Length Contexts	Proposes the HAMLET framework to evaluate how well large language models comprehend book-length texts	large language model	✅
14	Language Models Identify Ambiguities and Exploit Loopholes	Studies the ability of large language models to identify ambiguities and exploit loopholes	large language model
15	Forewarned is Forearmed: Pre-Synthesizing Jailbreak-like Instructions to Enhance LLM Safety Guardrail to Potential Attacks	Proposes the IMAGINE framework to strengthen the safety of large language models	large language model
16	AgentCoMa: A Compositional Benchmark Mixing Commonsense and Mathematical Reasoning in Real-World Scenarios	Proposes AgentCoMa to benchmark reasoning that mixes commonsense and mathematics	large language model
17	Scalable and consistent few-shot classification of survey responses using text embeddings	Proposes a text-embedding-based classification framework for analyzing open-ended survey responses	large language model
18	T2R-bench: A Benchmark for Generating Article-Level Reports from Real World Industrial Tables	Proposes T2R-bench to benchmark report generation from industrial tables	large language model
19	Spotlight Attention: Towards Efficient LLM Generation via Non-linear Hashing-based KV Cache Retrieval	Proposes Spotlight Attention to improve KV cache efficiency in LLM generation	large language model
20	Continuously Steering LLMs Sensitivity to Contextual Knowledge with Proxy Models	Proposes the CSKS framework to adjust the sensitivity of LLMs to contextual knowledge	large language model	✅
21	Understanding and Leveraging the Expert Specialization of Context Faithfulness in Mixture-of-Experts LLMs	Proposes Router Lens and CEFT to improve context faithfulness in mixture-of-experts models	large language model
22	ArgCMV: An Argument Summarization Benchmark for the LLM-era	Proposes the ArgCMV dataset to address the shortcomings of existing argument summarization benchmarks	large language model
23	Functional Consistency of LLM Code Embeddings: A Self-Evolving Data Synthesis Framework for Benchmarking	Proposes a functional consistency framework to improve the performance of code embedding models	large language model
24	Rule Synergy Analysis using LLMs: State of the Art and Implications	Uses LLMs to analyze rule synergy for reasoning in complex environments	large language model

🔬 Pillar 2: RL Algorithms & Architecture (8 papers)

#	Title	One-line summary	Tags	🔗
25	Memory-R1: Enhancing Large Language Model Agents to Manage and Utilize Memories via Reinforcement Learning	Proposes Memory-R1 to strengthen the memory management ability of large language models	reinforcement learning PPO large language model
26	AR$^2$: Adversarial Reinforcement Learning for Abstract Reasoning in Large Language Models	Proposes the AR$^2$ framework to strengthen the abstract reasoning ability of large language models	reinforcement learning large language model
27	QuesGenie: Intelligent Multimodal Question Generation	Proposes a multimodal question generation system to address the shortage of practice material in educational resources	reinforcement learning RLHF multimodal
28	Can Compact Language Models Search Like Agents? Distillation-Guided Policy Optimization for Preserving Agentic RAG Capabilities	Proposes distillation-guided policy optimization to improve the agentic search ability of compact language models	reinforcement learning distillation
29	Alignment with Fill-In-the-Middle for Enhancing Code Generation	Proposes a fill-in-the-middle alignment method to improve code generation	DPO direct preference optimization large language model	✅
30	Disabling Self-Correction in Retrieval-Augmented Generation via Stealthy Retriever Poisoning	Proposes DisarmRAG, a retriever poisoning attack that disables self-correction in RAG systems	contrastive learning large language model
31	Beyond Shallow Heuristics: Leveraging Human Intuition for Curriculum Learning	Uses human intuition to refine curriculum learning and improve language model training	curriculum learning
32	HEAL: A Hypothesis-Based Preference-Aware Analysis Framework	Proposes the HEAL framework to address gaps in the evaluation of preference optimization	preference learning DPO
