cs.CL(2025-10-14)

📊 50 papers in total | 🔗 7 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (40 🔗6) · Pillar 2: RL & Architecture (8 🔗1) · Pillar 4: Generative Motion (1) · Pillar 3: Perception & Semantics (1)

🔬 Pillar 9: Embodied Foundation Models (40 papers)

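| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 1 | SafeMT: Multi-turn Safety for Multimodal Language Models | Proposes the SafeMT benchmark for evaluating the safety of multimodal large language models in multi-turn dialogue, along with a dialogue safety moderator. | large language model, multimodal | |
| 2 | Multi-stage Prompt Refinement for Mitigating Hallucinations in Large Language Models | Proposes the Multi-stage Prompt Refinement (MPR) framework to mitigate hallucinations in large language models. | large language model | |
| 3 | CPR: Mitigating Large Language Model Hallucinations with Curative Prompt Refinement | Proposes the CPR framework, which mitigates LLM hallucinations by refining prompts. | large language model | |
| 4 | From Knowledge to Treatment: Large Language Model Assisted Biomedical Concept Representation for Drug Repurposing | LLaDR: uses large language models to assist biomedical concept representation for drug repurposing. | large language model | |
| 5 | Credal Transformer: A Principled Approach for Quantifying and Mitigating Hallucinations in Large Language Models | Proposes the Credal Transformer, which mitigates LLM hallucinations through uncertainty modeling. | large language model | |
| 6 | A Survey on Collaborating Small and Large Language Models for Performance, Cost-effectiveness, Cloud-edge Privacy, and Trustworthiness | Surveys collaboration between small and large language models to improve performance, reduce cost, and safeguard privacy and trustworthiness. | large language model | |
| 7 | Deep Associations, High Creativity: A Simple yet Effective Metric for Evaluating Large Language Models | Proposes PACE, a simple and efficient metric for evaluating LLM creativity that avoids data contamination and correlates strongly with human evaluation. | large language model | |
| 8 | Investigating Political and Demographic Associations in Large Language Models Through Moral Foundations Theory | Investigates political and demographic associations in large language models through Moral Foundations Theory. | large language model | |
| 9 | COSTAR-A: A prompting framework for enhancing Large Language Model performance on Point-of-View questions | The COSTAR-A framework improves small-model performance on point-of-view questions through prompt optimization. | large language model | |
| 10 | Community size rather than grammatical complexity better predicts Large Language Model accuracy in a novel Wug Test | A Wug Test shows that language model accuracy is driven by community size rather than grammatical complexity. | large language model | |
| 11 | Too Open for Opinion? Embracing Open-Endedness in Large Language Models for Social Simulation | Explores open-endedness in LLMs for social simulation: improving measurement, reducing bias, and increasing methodological utility. | large language model | |
| 12 | Uncertainty Quantification for Hallucination Detection in Large Language Models: Foundations, Methodology, and Future Directions | Surveys uncertainty quantification methods for hallucination detection in large language models. | large language model | |
| 13 | Schema for In-Context Learning | Proposes the SA-ICL framework, which improves LLM in-context learning through explicit schema activation. | large language model, chain-of-thought | |
| 14 | Not in Sync: Unveiling Temporal Bias in Audio Chat Models | Reveals temporal bias in audio chat models and proposes the TBI metric to quantify it. | multimodal, TAMP | |
| 15 | A Survey on Parallel Reasoning | Surveys parallel reasoning, an emerging reasoning paradigm for improving LLM robustness. | large language model, chain-of-thought | |
| 16 | Omni-Captioner: Data Pipeline, Models, and Benchmark for Omni Detailed Perception | Proposes Omni-Captioner for fine-grained multimodal perception and builds the corresponding dataset, models, and benchmark. | multimodal | |
| 17 | Toward LLM-Supported Automated Assessment of Critical Thinking Subskills | Uses large language models to automatically assess students' critical thinking subskills. | large language model | |
| 18 | Dr.LLM: Dynamic Layer Routing in LLMs | Dr.LLM: improves LLM inference efficiency and accuracy through dynamic layer routing. | large language model | |
| 19 | Narrow Finetuning Leaves Clearly Readable Traces in Activation Differences | Narrow finetuning leaves readable traces in LLM activation differences, which can be used to understand the finetuning domain. | large language model | |
| 20 | StyleDecipher: Robust and Explainable Detection of LLM-Generated Texts with Stylistic Analysis | StyleDecipher: uses stylistic analysis for robust and explainable detection of LLM-generated text. | large language model | |
| 21 | When Personalization Tricks Detectors: The Feature-Inversion Trap in Machine-Generated Text Detection | Proposes a benchmark for personalized text detection to address the identification of machine-generated text. | large language model | |
| 22 | Guarding the Guardrails: A Taxonomy-Driven Approach to Jailbreak Detection | Proposes a taxonomy-driven method for detecting jailbreak attacks to improve LLM safety. | large language model | |
| 23 | Tokenization Disparities as Infrastructure Bias: How Subword Systems Create Inequities in LLM Access and Efficiency | Reveals infrastructure bias in tokenization disparities: how subword systems create inequities in LLM access and efficiency. | large language model | |
| 24 | LLM-REVal: Can We Trust LLM Reviewers Yet? | LLM-REVal: evaluates the reliability of LLMs as reviewers, revealing their biases and potential risks. | large language model | |
| 25 | Analysing Moral Bias in Finetuned LLMs through Mechanistic Interpretability | Analyzes moral bias in finetuned LLMs through mechanistic interpretability and proposes a mitigation method. | large language model | |
| 26 | An AI-Based Behavioral Health Safety Filter and Dataset for Identifying Mental Health Crises in Text-Based Conversations | Proposes an AI-based behavioral health safety filter and dataset for identifying mental health crises in text-based conversations. | large language model | |
| 27 | Interpreting the Latent Structure of Operator Precedence in Language Models | Studies how LLMs internally encode arithmetic operator precedence, revealing intermediate computations. | large language model | |
| 28 | LLM Prompt Duel Optimizer: Efficient Label-Free Prompt Optimization | Proposes the Prompt Duel Optimizer (PDO), which efficiently addresses label-free LLM prompt optimization. | large language model | |
| 29 | OPLoRA: Orthogonal Projection LoRA Prevents Catastrophic Forgetting during Parameter-Efficient Fine-Tuning | OPLoRA: orthogonal projection LoRA prevents catastrophic forgetting during parameter-efficient fine-tuning. | large language model | |
| 30 | A Multilingual, Large-Scale Study of the Interplay between LLM Safeguards, Personalisation, and Disinformation | A large-scale multilingual study reveals the complex relationship between LLM safeguards, personalization, and the spread of disinformation. | large language model | |
| 31 | 3-Model Speculative Decoding | Proposes pyramid speculative decoding, which speeds up LLM inference by introducing an intermediate model. | large language model | |
| 32 | The Curious Case of Curiosity across Human Cultures and LLMs | Proposes the CUEST framework, which reveals biases in how LLMs express curiosity across cultures and proposes improvements. | large language model | |
| 33 | RAID: Refusal-Aware and Integrated Decoding for Jailbreaking LLMs | Proposes RAID, a refusal-aware and integrated decoding framework for jailbreaking large language models. | large language model | |
| 34 | Attribution Quality in AI-Generated Content: Benchmarking Style Embeddings and LLM Judges | Compares style embeddings and LLM judges to evaluate attribution quality in AI-generated content and builds a benchmark. | large language model | |
| 35 | Probing Latent Knowledge Conflict for Faithful Retrieval-Augmented Generation | Proposes the CLEAR framework, which improves the faithfulness of RAG systems by probing latent knowledge conflicts. | large language model | |
| 36 | Fine-grained Analysis of Brain-LLM Alignment through Input Attribution | Proposes a fine-grained input attribution method for in-depth analysis of brain-LLM alignment. | large language model | |
| 37 | A large-scale, unsupervised pipeline for automatic corpus annotation using LLMs: variation and change in the English consider construction | Proposes a large-scale, unsupervised LLM-based pipeline for automatic corpus annotation to accelerate corpus linguistics research. | large language model | |
| 38 | The Harder The Better: Maintaining Supervised Fine-tuning Generalization with Less but Harder Data | Proposes the THTB framework, which maintains supervised fine-tuning generalization using less but harder data. | large language model | |
| 39 | Towards Inference-time Scaling for Continuous Space Reasoning | Explores the application and challenges of inference-time scaling in continuous space reasoning. | large language model | |
| 40 | Information Extraction from Conversation Transcripts: Neuro-Symbolic vs. LLM | Compares neuro-symbolic and LLM approaches, evaluating the performance and cost of information extraction from conversations in the agricultural domain. | large language model | |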

🔬 Pillar 2: RL & Architecture (8 papers)

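| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 41 | DPO-Tuned Large Language Models for Segmentation in Simultaneous Speech Translation | Proposes DPO-tuned large language models to improve segmentation quality in simultaneous speech translation. | DPO, direct preference optimization, large language model | |
| 42 | Hierarchical Alignment: Surgical Fine-Tuning via Functional Layer Specialization in Large Language Models | Proposes Hierarchical Alignment, which fine-tunes large language models via functional layer specialization to improve performance. | DPO, direct preference optimization, large language model | |
| 43 | SMEC: Rethinking Matryoshka Representation Learning for Retrieval Embedding Compression | Proposes the SMEC framework for retrieval embedding compression, substantially reducing dimensionality while preserving performance. | representation learning, large language model, multimodal | |
| 44 | EduDial: Constructing a Large-scale Multi-turn Teacher-Student Dialogue Corpus | Constructs EduDial, a large-scale teacher-student dialogue corpus, to improve LLM teaching ability in educational settings. | teacher-student, large language model | |
| 45 | Reliable Fine-Grained Evaluation of Natural Language Math Proofs | Proposes ProofGrader for reliably evaluating the quality of LLM-generated natural language math proofs. | MAE, IMoS, large language model | |
| 46 | On the Role of Preference Variance in Preference Optimization | Proposes a preference-variance-based optimization method to improve the efficiency of learning from human preferences. | DPO, direct preference optimization, large language model | |
| 47 | Improving Text-to-Image Generation with Input-Side Inference-Time Scaling | Proposes an LLM-based prompt rewriting framework that improves text-to-image generation, especially for underspecified prompts. | DPO, direct preference optimization, large language model | |
| 48 | Reasoning Pattern Matters: Learning to Reason without Human Rationales | Proposes the PARO framework, which uses LLMs to automatically generate annotations that follow reasoning patterns, reducing the SFT+RLVR paradigm's reliance on human annotation. | reinforcement learning, large language model | |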

🔬 Pillar 4: Generative Motion (1 paper)

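| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 49 | HackWorld: Evaluating Computer-Use Agents on Exploiting Web Application Vulnerabilities | HackWorld: evaluates the ability of computer-use agents to exploit web application vulnerabilities. | penetration | |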

🔬 Pillar 3: Perception & Semantics (1 paper)

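| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 50 | VLURes: Benchmarking VLM Visual and Linguistic Understanding in Low-Resource Languages | VLURes: proposes a multilingual visual and linguistic understanding benchmark for evaluating VLM performance in low-resource languages. | scene understanding | |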
