cs.CL (2025-12-18)

📊 20 papers in total | 🔗 2 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (16 🔗2) · Pillar 2: RL Algorithms & Architecture (4)

🔬 Pillar 9: Embodied Foundation Models (16 papers)

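# | Title | One-sentence summary | Tags | 🔗
1 | Multimodal RewardBench 2: Evaluating Omni Reward Models for Interleaved Text and Image | Proposes Multimodal RewardBench 2 (MMRB2) for evaluating multimodal reward models on interleaved image-text tasks. | large language model, multimodal |
2 | A Women's Health Benchmark for Large Language Models | Proposes the Women's Health Benchmark (WHB) to evaluate the reliability of large language models in women's health. | large language model |
3 | Jailbreak-Zero: A Path to Pareto Optimal Red Teaming for Large Language Models | Jailbreak-Zero: a method for Pareto-optimal red teaming of large language models. | large language model |
4 | Benchmarking and Adapting On-Device Large Language Models for Clinical Decision Support | Evaluates and adapts on-device large language models for clinical decision support, enabling privacy-preserving and efficient deployment. | large language model |
5 | An Information-Theoretic Framework for Robust Large Language Model Editing | Proposes IBKE, an information-bottleneck-based framework for robust knowledge editing in large language models. | large language model |
6 | DualGuard: Dual-stream Large Language Model Watermarking Defense against Paraphrase and Spoofing Attack | Proposes DualGuard, a dual-stream watermarking defense for large language models against paraphrase and spoofing attacks. | large language model |
7 | Hearing to Translate: The Effectiveness of Speech Modality Integration into LLMs | Presents the first comprehensive SpeechLLM evaluation, comparing the speech translation performance of end-to-end and cascaded architectures. | large language model, foundation model |
8 | Perturb Your Data: Paraphrase-Guided Training Data Watermarking | SPECTRA: a paraphrase-guided training-data watermarking method for tracing the provenance of LLM training data. | large language model |
9 | When F1 Fails: Granularity-Aware Evaluation for Dialogue Topic Segmentation | Proposes a granularity-aware evaluation framework for dialogue topic segmentation that addresses the limitations of the conventional F1 metric. | large language model |
10 | From Facts to Conclusions: Integrating Deductive Reasoning in Retrieval-Augmented LLMs | Proposes a RAG framework augmented with reasoning traces to handle conflicting and subjective retrieved information. | large language model |
11 | Refusal Steering: Fine-grained Control over LLM Refusal Behaviour for Sensitive Topics | Proposes Refusal Steering for fine-grained control over LLM refusal behaviour on sensitive topics. | large language model |
12 | From Essence to Defense: Adaptive Semantic-aware Watermarking for Embedding-as-a-Service Copyright Protection | Proposes SemMark, an adaptive semantic-aware watermarking method for protecting the copyright of Embedding-as-a-Service. | large language model |
13 | Evaluating OpenAI GPT Models for Translation of Endangered Uralic Languages: A Comparison of Reasoning and Non-Reasoning Architectures | Evaluates OpenAI GPT models on translation of endangered Uralic languages, comparing reasoning and non-reasoning architectures. | large language model |
14 | Sigma-MoE-Tiny Technical Report | Proposes Sigma-MoE-Tiny, a highly sparse MoE language model that tackles the challenge of expert load balancing. | foundation model |
15 | LoPA: Scaling dLLM Inference via Lookahead Parallel Decoding | LoPA: accelerates diffusion large language model (dLLM) inference via lookahead parallel decoding. | large language model |
16 | ContextLeak: Auditing Leakage in Private In-Context Learning Methods | ContextLeak: the first framework for auditing leakage in private in-context learning methods. | large language model |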

🔬 Pillar 2: RL Algorithms & Architecture (4 papers)

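# | Title | One-sentence summary | Tags | 🔗
17 | Knowledge Distillation with Structured Chain-of-Thought for Text-to-SQL | Proposes the Struct-SQL framework, which uses structured CoT distillation to improve the performance of small Text-to-SQL models. | distillation, large language model, chain-of-thought |
18 | AdaSearch: Balancing Parametric Knowledge and Search in Large Language Models via Reinforcement Learning | Proposes AdaSearch, which uses reinforcement learning to balance a large language model's parametric knowledge with external search. | reinforcement learning, large language model |
19 | JustRL: Scaling a 1.5B LLM with a Simple RL Recipe | JustRL: scales a 1.5B-parameter large language model with a simple reinforcement learning recipe, achieving strong reasoning performance. | reinforcement learning, curriculum learning, large language model |
20 | Emergent World Beliefs: Exploring Transformers in Stochastic Games | Explores the emergent world beliefs of Transformers in stochastic games, using poker as a case study. | world model, large language model |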
