cs.CL(2025-10-27)

📊 43 papers in total | 🔗 5 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (36 🔗4) · Pillar 2: RL & Architecture (6 🔗1) · Pillar 8: Physics-based Animation (1)

🔬 Pillar 9: Embodied Foundation Models (36 papers)

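| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 1 | Evaluating Large Language Models for Stance Detection on Financial Targets from SEC Filing Reports and Earnings Call Transcripts | Uses large language models for stance detection on financial targets in SEC filings and earnings call transcripts. | large language model, chain-of-thought | |
| 2 | MMTutorBench: The First Multimodal Benchmark for AI Math Tutoring | Proposes MMTutorBench, the first multimodal benchmark for AI math tutoring. | large language model, multimodal | |
| 3 | SI-Bench: Benchmarking Social Intelligence of Large Language Models in Human-to-Human Conversations | SI-Bench: builds a social intelligence benchmark to evaluate how LLMs perform in human-to-human conversations. | large language model, chain-of-thought | |
| 4 | MAP4TS: A Multi-Aspect Prompting Framework for Time-Series Forecasting with Large Language Models | MAP4TS: a multi-aspect prompting framework for time-series forecasting with LLMs. | large language model, multimodal | |
| 5 | Agent-based Automated Claim Matching with Instruction-following LLMs | Proposes an agent-based automated claim matching method that uses instruction-following LLMs to improve matching performance. | instruction following | |
| 6 | Large Language Models Report Subjective Experience Under Self-Referential Processing | Elicits subjective-experience reports from LLMs through self-referential processing. | large language model | |
| 7 | M4FC: a Multimodal, Multilingual, Multicultural, Multitask Real-World Fact-Checking Dataset | M4FC: presents a multimodal, multilingual, multicultural, multitask real-world fact-checking dataset. | multimodal | |
| 8 | Adaptive Blockwise Search: Inference-Time Alignment for Large Language Models | AdaSearch: an adaptive blockwise search algorithm for inference-time alignment of LLMs. | large language model | |
| 9 | Are ASR foundation models generalized enough to capture features of regional dialects for low-resource languages? | Evaluates how well ASR foundation models generalize to the dialect features of low-resource languages. | foundation model | |
| 10 | LangLingual: A Personalised, Exercise-oriented English Language Learning Tool Leveraging Large Language Models | LangLingual: uses LLMs to build a personalised, exercise-oriented English learning tool. | large language model | |
| 11 | Auto prompting without training labels: An LLM cascade for product quality assessment in e-commerce catalogs | Proposes an LLM cascade that needs no training labels for product quality assessment in e-commerce. | large language model, chain-of-thought | |
| 12 | Evaluating Long-Term Memory for Long-Context Question Answering | Systematically evaluates several memory-augmentation methods for long-context question answering, improving efficiency while maintaining accuracy. | large language model, foundation model | |
| 13 | ISA-Bench: Benchmarking Instruction Sensitivity for Large Audio Language Models | ISA-Bench: a benchmark for the instruction sensitivity of large audio language models. | large language model, instruction following | |
| 14 | ENTP: Enhancing Low-Quality SFT Data via Neural-Symbolic Text Purge-Mix | ENTP: enhances low-quality SFT data via neural-symbolic text purge-mix. | large language model, instruction following | |
| 15 | A Survey on LLM Mid-Training | Surveys LLM mid-training, which bridges pre-training and post-training to strengthen specific capabilities. | large language model, foundation model | |
| 16 | Automatización de Informes Geotécnicos para Macizos Rocosos con IA | Proposes a method based on multimodal LLMs for automatically generating geotechnical reports, improving efficiency and reducing subjective error. | large language model, multimodal | |
| 17 | Omni-Reward: Towards Generalist Omni-Modal Reward Modeling with Free-Form Preferences | Proposes Omni-Reward for generalist omni-modal reward modeling that supports free-form preferences. | multimodal | |
| 18 | Beyond Higher Rank: Token-wise Input-Output Projections for Efficient Low-Rank Adaptation | Proposes Token-wise Projected LoRA (TopLoRA) for finer-grained parameter-efficient fine-tuning. | large language model | |
| 19 | TimeStampEval: A Simple LLM Eval and a Little Fuzzy Matching Trick to Improve Search Accuracy | Proposes the TimeStampEval benchmark and the Assisted Fuzzy method to improve LLM accuracy at timestamp retrieval in noisy text. | TAMP | |
| 20 | Can LLMs Narrate Tabular Data? An Evaluation Framework for Natural Language Representations of Text-to-SQL System Outputs | Proposes the Combo-Eval framework and the NLR-BIRD dataset for evaluating LLM-generated natural language representations of Text-to-SQL system outputs. | large language model | |
| 21 | EMTSF: Extraordinary Mixture of SOTA Models for Time Series Forecasting | Proposes EMTSF, a mixture-of-experts time-series forecasting framework that combines SOTA models. | large language model | |
| 22 | Detecting Religious Language in Climate Discourse | Proposes a dual approach to detecting religious language in climate discourse, comparing a rule-based model with LLMs. | large language model | |
| 23 | Beyond Direct Generation: A Decomposed Approach to Well-Crafted Screenwriting with LLMs | Proposes DSR, a two-stage refinement framework that addresses the difficulty LLMs have in balancing creativity and format when generating high-quality screenplays. | large language model | |
| 24 | Your LLM Agents are Temporally Blind: The Misalignment Between Tool Use Decisions and Human Time Perception | Reveals a temporal blind spot in LLM agents: their tool-use decisions are misaligned with human time perception. | large language model | |
| 25 | Beyond Understanding: Evaluating the Pragmatic Gap in LLMs' Cultural Processing of Figurative Language | Evaluates the pragmatic gap in how LLMs process figurative language in cultural contexts. | large language model | |
| 26 | BitSkip: An Empirical Analysis of Quantization and Early Exit Composition | The BitSkip framework reveals counterintuitive effects of combining quantization with early exit: an 8-bit quantized model outperforms more complex 4-bit models. | large language model | |
| 27 | LimRank: Less is More for Reasoning-Intensive Information Reranking | Proposes LimRank, which fine-tunes an LLM on a small amount of high-quality data for efficient reasoning-intensive information reranking. | instruction following | |
| 28 | How AI Forecasts AI Jobs: Benchmarking LLM Predictions of Labor Market Changes | Proposes an LLM-based labor-market forecasting benchmark to assess AI's impact on employment. | large language model | |
| 29 | LightKGG: Simple and Efficient Knowledge Graph Generation from Textual Data | LightKGG: generates knowledge graphs efficiently with small language models, lowering the barrier to AI applications. | large language model | |
| 30 | BaZi-Based Character Simulation Benchmark: Evaluating AI on Temporal and Persona Reasoning | Proposes a BaZi-based character simulation benchmark to advance AI's temporal and persona reasoning. | large language model | |
| 31 | Fast-MIA: Efficient and Scalable Membership Inference for LLMs | Fast-MIA: an efficient, scalable tool for evaluating membership inference attacks on LLMs. | large language model | |
| 32 | Knocking-Heads Attention | Proposes Knocking-Heads Attention, which improves the representational capacity of LLMs through cross-head interaction. | large language model | |
| 33 | Retracing the Past: LLMs Emit Training Data When They Get Lost | Proposes the confusion-inducing attack (CIA), which extracts LLM training data by maximizing model uncertainty. | large language model | |
| 34 | Measuring Teaching with LLMs | Uses custom LLMs and sentence embeddings for objective, scalable assessment of teaching quality. | large language model | |
| 35 | MAD-Fact: A Multi-Agent Debate Framework for Long-Form Factuality Evaluation in LLMs | Proposes the MAD-Fact framework for evaluating the factual accuracy of LLMs in long-form generation. | large language model | |
| 36 | Language Server CLI Empowers Language Agents with Process Rewards | Lanser-CLI empowers language agents with process rewards, addressing API hallucination and incorrect edits. | large language model | |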

🔬 Pillar 2: RL & Architecture (6 papers)

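| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 37 | Incentivizing Agentic Reasoning in LLM Judges via Tool-Integrated Reinforcement Learning | Proposes TIR-Judge, which uses tool-integrated reinforcement learning to improve the agentic reasoning of LLM judges. | reinforcement learning, distillation, large language model | |
| 38 | Think Twice: Branch-and-Rethink Reasoning Reward Model | Proposes the Branch-and-Rethink reward model, which improves the accuracy and efficiency of reward modeling by thinking twice. | reinforcement learning, RLHF, large language model | |
| 39 | Code Aesthetics with Agentic Reward Feedback | Proposes GRPO-AR, an algorithm based on agentic reward feedback that improves the aesthetics of LLM-generated code and outperforms GPT-4o. | reinforcement learning, large language model | |
| 40 | MATCH: Task-Driven Code Evaluation through Contrastive Learning | Proposes MATCH, which uses contrastive learning for task-driven code evaluation without reference code. | contrastive learning | |
| 41 | StreetMath: Study of LLMs' Approximation Behaviors | StreetMath: studies how well LLMs approximate in quick arithmetic and reveals differences from human cognition. | Mamba, large language model | |
| 42 | Understanding In-Context Learning Beyond Transformers: An Investigation of State Space and Hybrid Architectures | Investigates in-context learning mechanisms in state-space and hybrid architectures. | Mamba, large language model | |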

🔬 Pillar 8: Physics-based Animation (1 paper)

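| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 43 | How Pragmatics Shape Articulation: A Computational Case Study in STEM ASL Discourse | Proposes a pragmatics-aware approach to modeling dynamic sign-language articulation in STEM ASL discourse. | spatiotemporal | |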
