cs.CL (2025-08-27)

📊 32 papers in total | 🔗 4 with code

🎯 Interest Areas

Pillar 9: Embodied Foundation Models (24 🔗3) · Pillar 2: RL Algorithms & Architecture (8 🔗1)

🔬 Pillar 9: Embodied Foundation Models (24 papers)

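| # | Title | One-sentence summary | Tags | 🔗 |
|---|-------|----------------------|------|----|
| 1 | 11Plus-Bench: Demystifying Multimodal LLM Spatial Reasoning with Cognitive-Inspired Analysis | Proposes 11Plus-Bench to evaluate the spatial reasoning ability of multimodal large language models | large language model, multimodal | |
| 2 | Uncertainty-Aware Collaborative System of Large and Small Models for Multimodal Sentiment Analysis | Proposes an uncertainty-aware collaborative system to address performance and efficiency issues in multimodal sentiment analysis | large language model, multimodal | |
| 3 | Prompting Strategies for Language Model-Based Item Generation in K-12 Education: Bridging the Gap Between Small and Large Language Models | Proposes structured prompting strategies to improve the quality of item generation in K-12 education | large language model, chain-of-thought | |
| 4 | Survey of Specialized Large Language Model | Systematically evaluates specialized large language models for professional-domain applications | large language model, multimodal | |
| 5 | MathBuddy: A Multimodal System for Affective Math Tutoring | Proposes MathBuddy to address the impact of affective states on math learning | multimodal | |
| 6 | Dhati+: Fine-tuned Large Language Models for Arabic Subjectivity Evaluation | Proposes Dhati+ to address the shortage of data for Arabic subjectivity evaluation | large language model | |
| 7 | INSEva: A Comprehensive Chinese Benchmark for Large Language Models in Insurance | Proposes the INSEva benchmark to address the lack of AI evaluation in the insurance domain | large language model | |
| 8 | Geopolitical Parallax: Beyond Walter Lippmann Just After Large Language Models | Proposes geopolitical parallax analysis to address bias in large language models | large language model | |
| 9 | Logical Reasoning with Outcome Reward Models for Test-Time Scaling | Proposes outcome reward models to improve logical reasoning in reasoning tasks | large language model, chain-of-thought | |
| 10 | Do MLLMs Really Understand the Charts? | Proposes ChartVRBench to address the shortcomings of multimodal large language models in chart understanding | large language model, multimodal | |
| 11 | Benchmarking Hindi LLMs: A New Suite of Datasets and a Comparative Analysis | Proposes five Hindi LLM evaluation datasets to address evaluation challenges | large language model | |
| 12 | LFD: Layer Fused Decoding to Exploit External Knowledge in Retrieval-Augmented Generation | Proposes Layer Fused Decoding to improve the use of external knowledge in retrieval-augmented generation | large language model | |
| 13 | Towards a Holistic and Automated Evaluation Framework for Multi-Level Comprehension of LLMs in Book-Length Contexts | Proposes the HAMLET framework to evaluate how well large language models comprehend long texts | large language model | |
| 14 | Language Models Identify Ambiguities and Exploit Loopholes | Studies the ability of large language models to identify ambiguities and exploit loopholes | large language model | |
| 15 | Forewarned is Forearmed: Pre-Synthesizing Jailbreak-like Instructions to Enhance LLM Safety Guardrail to Potential Attacks | Proposes the IMAGINE framework to strengthen the safety of large language models | large language model | |
| 16 | AgentCoMa: A Compositional Benchmark Mixing Commonsense and Mathematical Reasoning in Real-World Scenarios | Proposes AgentCoMa to address problems that mix commonsense and mathematical reasoning | large language model | |
| 17 | Scalable and consistent few-shot classification of survey responses using text embeddings | Proposes a text-embedding-based classification framework for analyzing open-ended survey responses | large language model | |
| 18 | T2R-bench: A Benchmark for Generating Article-Level Reports from Real World Industrial Tables | Proposes T2R-bench to address report generation from industrial tables | large language model | |
| 19 | Spotlight Attention: Towards Efficient LLM Generation via Non-linear Hashing-based KV Cache Retrieval | Proposes Spotlight Attention to address KV cache efficiency in LLM generation | large language model | |
| 20 | Continuously Steering LLMs Sensitivity to Contextual Knowledge with Proxy Models | Proposes the CSKS framework to adjust the sensitivity of LLMs to contextual knowledge | large language model | |
| 21 | Understanding and Leveraging the Expert Specialization of Context Faithfulness in Mixture-of-Experts LLMs | Proposes Router Lens and CEFT to improve the context faithfulness of mixture-of-experts models | large language model | |
| 22 | ArgCMV: An Argument Summarization Benchmark for the LLM-era | Proposes the ArgCMV dataset to address the shortcomings of existing argument summarization benchmarks | large language model | |
| 23 | Functional Consistency of LLM Code Embeddings: A Self-Evolving Data Synthesis Framework for Benchmarking | Proposes a functional consistency framework to improve the performance of code embedding models | large language model | |
| 24 | Rule Synergy Analysis using LLMs: State of the Art and Implications | Uses LLMs to analyze rule synergy for reasoning in complex environments | large language model | |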

🔬 Pillar 2: RL Algorithms & Architecture (8 papers)

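| # | Title | One-sentence summary | Tags | 🔗 |
|---|-------|----------------------|------|----|
| 25 | Memory-R1: Enhancing Large Language Model Agents to Manage and Utilize Memories via Reinforcement Learning | Proposes Memory-R1 to enhance the memory management ability of large language models | reinforcement learning, PPO, large language model | |
| 26 | AR$^2$: Adversarial Reinforcement Learning for Abstract Reasoning in Large Language Models | Proposes the AR$^2$ framework to enhance the abstract reasoning ability of large language models | reinforcement learning, large language model | |
| 27 | QuesGenie: Intelligent Multimodal Question Generation | Proposes a multimodal question generation system to address the shortage of practice materials in educational resources | reinforcement learning, RLHF, multimodal | |
| 28 | Can Compact Language Models Search Like Agents? Distillation-Guided Policy Optimization for Preserving Agentic RAG Capabilities | Proposes Distillation-Guided Policy Optimization to improve the agentic search ability of compact language models | reinforcement learning, distillation | |
| 29 | Alignment with Fill-In-the-Middle for Enhancing Code Generation | Proposes a fill-in-the-middle alignment method to improve code generation | DPO, direct preference optimization, large language model | |
| 30 | Disabling Self-Correction in Retrieval-Augmented Generation via Stealthy Retriever Poisoning | Proposes DisarmRAG to tackle the challenge posed by the self-correction ability of RAG systems | contrastive learning, large language model | |
| 31 | Beyond Shallow Heuristics: Leveraging Human Intuition for Curriculum Learning | Leverages human intuition to optimize curriculum learning and improve language model training | curriculum learning | |
| 32 | HEAL: A Hypothesis-Based Preference-Aware Analysis Framework | Proposes the HEAL framework to address the insufficient evaluation of preference optimization | preference learning, DPO | |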
