cs.CL (2025-04-16)

📊 26 papers in total | 🔗 2 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (20 🔗1) · Pillar 2: RL & Architecture (5 🔗1) · Pillar 1: Robot Control (1)

🔬 Pillar 9: Embodied Foundation Models (20 papers)

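# | Title | One-line summary | Tags | 🔗
1 | What Do Large Language Models Know? Tacit Knowledge as a Potential Causal-Explanatory Structure | Examines whether large language models possess tacit knowledge, treating it as a causal-explanatory structure | large language model |
2 | Multilingual Contextualization of Large Language Models for Document-Level Machine Translation | Introduces DocBlocks and uses multi-paradigm fine-tuning to improve LLM performance on document-level machine translation | large language model |
3 | Large Language Models as Quasi-crystals: Coherence Without Repetition in Generative Text | Likens large language models to quasicrystals: coherence without repetition in generated text | large language model |
4 | Waking Up an AI: A Quantitative Framework for Prompt-Induced Phase Transition in Large Language Models | Proposes a quantitative framework for studying prompt-induced cognitive phase transitions in large language models | large language model |
5 | Replicating ReLM Results: Validating Large Language Models with ReLM | Uses the formal-language tool ReLM to validate the memorization, bias, and zero-shot performance of large language models | large language model |
6 | Leveraging Large Language Models for Multi-Class and Multi-Label Detection of Drug Use and Overdose Symptoms on Social Media | Uses large language models for multi-class and multi-label detection of drug use and overdose symptoms on social media | large language model |
7 | An LLM-as-a-judge Approach for Scalable Gender-Neutral Translation Evaluation | Proposes an LLM-based method for evaluating gender-neutral translation, improving evaluation accuracy and scalability | large language model, chain-of-thought |
8 | FiSMiness: A Finite State Machine Based Paradigm for Emotional Support Conversations | Proposes FiSMiness, a finite-state-machine-based framework, to improve the long-term effectiveness of emotional support conversations | large language model, chain-of-thought |
9 | Memorization vs. Reasoning: Updating LLMs with New Knowledge | Proposes the KUP benchmark and the MCT training method to improve how LLMs memorize and reason over new knowledge | large language model |
10 | A Human-AI Comparative Analysis of Prompt Sensitivity in LLM-Based Relevance Judgment | Studies how LLM prompt sensitivity affects relevance judgments in information retrieval, and provides a dataset | large language model |
11 | BitNet b1.58 2B4T Technical Report | BitNet b1.58: the first open-source 1-bit large language model at the 2-billion-parameter scale, balancing performance and efficiency | large language model |
12 | Entropy-Guided Watermarking for LLMs: A Test-Time Framework for Robust and Traceable Text Generation | Proposes an entropy-guided watermarking scheme to improve the traceability and robustness of LLM text generation | large language model |
13 | Gauging Overprecision in LLMs: An Empirical Study | Proposes a framework for evaluating overprecision in LLMs, revealing uncertainty-calibration problems in numerical tasks | large language model |
14 | SemEval-2025 Task 3: Mu-SHROOM, the Multilingual Shared Task on Hallucinations and Related Observable Overgeneration Mistakes | Mu-SHROOM: a multilingual shared task on LLM hallucination detection, focused on observable overgeneration mistakes | large language model |
15 | Finding Flawed Fictions: Evaluating Complex Reasoning in Language Models via Plot Hole Detection | Proposes the FlawedFictions benchmark for evaluating language models' complex reasoning in detecting story plot holes | large language model |
16 | Rethinking LLM-Based Recommendations: A Personalized Query-Driven Parallel Integration | Proposes a Query-to-Recommendation framework to address bias and serial bottlenecks in LLM recommender systems | large language model |
17 | Could Thinking Multilingually Empower LLM Reasoning? | Uses multilingual reasoning to raise the performance ceiling of large language models on complex tasks | large language model |
18 | Efficient and Adaptive Simultaneous Speech Translation with Fully Unidirectional Architecture | Proposes EASiST, an efficient and adaptive simultaneous speech translation model with a fully unidirectional architecture | large language model |
19 | WebRollback: Enhancing Web Agents with Explicit Rollback Mechanisms | WebRollback: enhances the navigation ability of web agents through an explicit rollback mechanism | large language model |
20 | Deep Binding of Language Model Virtual Personas: a Study on Approximating Political Partisan Misperceptions | Proposes a narrative-identity-based method for constructing LLM virtual personas to simulate political partisan misperceptions | large language model |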

🔬 Pillar 2: RL & Architecture (5 papers)

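# | Title | One-line summary | Tags | 🔗
21 | d1: Scaling Reasoning in Diffusion Large Language Models via Reinforcement Learning | Proposes the d1 framework, which uses reinforcement learning to improve diffusion large language models on reasoning tasks | reinforcement learning, large language model |
22 | Evaluating the Diversity and Quality of LLM Generated Content | Proposes an evaluation framework for effective semantic diversity, revealing the advantages of preference-tuned models in generating high-quality content | reinforcement learning, PPO, RLHF |
23 | SALAD: Improving Robustness and Generalization through Contrastive Learning with Structure-Aware and LLM-Driven Augmented Data | SALAD: uses structure-aware and LLM-driven contrastive learning to improve robustness and generalization | contrastive learning, large language model |
24 | Integrating Structural and Semantic Signals in Text-Attributed Graphs with BiGTex | BiGTex: integrates structural and semantic information in text-attributed graphs through bidirectional graph-text fusion units | representation learning, mutual attention, large language model |
25 | Trusting CHATGPT: how minor tweaks in the prompts lead to major differences in sentiment classification | Shows that ChatGPT's sentiment classification is sensitive to minor prompt tweaks, calling its reliability into question | predictive model, large language model |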

🔬 Pillar 1: Robot Control (1 paper)

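# | Title | One-line summary | Tags | 🔗
26 | SLURG: Investigating the Feasibility of Generating Synthetic Online Fallacious Discourse | SLURG: explores the feasibility of using large language models to generate synthetic online fallacious discourse | manipulation, large language model |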
