cs.CL (2025-03-21)

📊 23 papers in total | 🔗 4 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (18 🔗4) · Pillar 2: RL & Architecture (5)

🔬 Pillar 9: Embodied Foundation Models (18 papers)

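| # | Title | One-sentence summary | Tags | 🔗 |
|---|-------|----------------------|------|----|
| 1 | When Tom Eats Kimchi: Evaluating Cultural Bias of Multimodal Large Language Models in Cultural Mixture Contexts | Proposes the MixCuBe benchmark to evaluate the cultural bias of multimodal large language models in culturally mixed contexts. | large language model, multimodal | |
| 2 | MTBench: A Multimodal Time Series Benchmark for Temporal Reasoning and Question Answering | Proposes MTBench, a multimodal time-series benchmark for evaluating LLMs on temporal reasoning and question answering. | large language model, multimodal | |
| 3 | Bayesian Teaching Enables Probabilistic Reasoning in Large Language Models | Bayesian teaching improves probabilistic reasoning in large language models. | large language model | |
| 4 | SaudiCulture: A Benchmark for Evaluating Large Language Models Cultural Competence within Saudi Arabia | Proposes the SaudiCulture benchmark to evaluate large language models' competence in the cultural context of Saudi Arabia. | large language model | |
| 5 | SafeMERGE: Preserving Safety Alignment in Fine-Tuned Large Language Models via Selective Layer-Wise Model Merging | SafeMERGE: preserves safety alignment in fine-tuned large language models through selective layer-wise model merging. | large language model | |
| 6 | Automating Adjudication of Cardiovascular Events Using Large Language Models | Proposes an LLM-based framework to automate clinical-trial adjudication of cardiovascular events. | large language model | |
| 7 | Text2Model: Generating dynamic chemical reactor models using large language models (LLMs) | Text2Model: uses large language models to generate dynamic chemical reactor models. | large language model | |
| 8 | A Survey on Personalized Alignment -- The Missing Piece for Large Language Models in Real-World Applications | Addresses the lack of personalized alignment for large language models in real-world applications with a comprehensive survey and a unified framework. | large language model | |
| 9 | Judge Anything: MLLM as a Judge Across Any Modality | Proposes the TaskAnything and JudgeAnything benchmarks to evaluate MLLMs on cross-modal understanding and generation tasks. | foundation model, multimodal | |
| 10 | From Text to Talent: A Pipeline for Extracting Insights from Candidate Profiles | Proposes a recruitment pipeline based on LLMs and graph similarity that recommends ideal candidates for job openings. | large language model, multimodal | |
| 11 | Language Models May Verbatim Complete Text They Were Not Explicitly Trained On | Large language models may generate text they were not explicitly trained on, challenging existing definitions of membership. | large language model | |
| 12 | Language-specific Neurons Do Not Facilitate Cross-Lingual Transfer | Shows that language-specific neurons do not effectively facilitate cross-lingual transfer in multilingual models. | large language model | |
| 13 | Leveraging Human Production-Interpretation Asymmetries to Test LLM Cognitive Plausibility | Uses human production-interpretation asymmetries to test the cognitive plausibility of LLMs. | large language model | |
| 14 | Dancing with Critiques: Enhancing LLM Reasoning with Stepwise Natural Language Self-Critique | Proposes PANEL, which uses natural-language self-critique to enhance LLM reasoning. | large language model | |
| 15 | CASE -- Condition-Aware Sentence Embeddings for Conditional Semantic Textual Similarity Measurement | Proposes the CASE model, which uses condition-aware sentence embeddings to improve conditional semantic textual similarity measurement. | large language model | |
| 16 | CoKe: Customizable Fine-Grained Story Evaluation via Chain-of-Keyword Rationalization | Proposes CoKe: customizable fine-grained story evaluation via chain-of-keyword rationalization. | chain-of-thought | |
| 17 | MMCR: Benchmarking Cross-Source Reasoning in Scientific Papers | Proposes the MMCR benchmark to evaluate vision-language models on cross-source reasoning in scientific papers. | chain-of-thought | |
| 18 | Interpretable LLM Guardrails via Sparse Representation Steering | Proposes the Sparse Representation Steering (SRS) framework for fine-grained, interpretable control of LLM behavior. | large language model | |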

🔬 Pillar 2: RL & Architecture (5 papers)

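| # | Title | One-sentence summary | Tags | 🔗 |
|---|-------|----------------------|------|----|
| 19 | Modifying Large Language Model Post-Training for Diverse Creative Writing | Proposes a deviation-based post-training method that improves the diversity and quality of large language models in creative writing. | DPO, direct preference optimization, large language model | |
| 20 | Federated Cross-Domain Click-Through Rate Prediction With Large Language Model Augmentation | Proposes a federated cross-domain click-through rate prediction framework to address privacy protection and data sparsity. | contrastive learning, large language model | |
| 21 | Praxis-VLM: Vision-Grounded Decision Making via Text-Driven Reinforcement Learning | Praxis-VLM: decision making in visual scenes via text-driven reinforcement learning. | reinforcement learning, multimodal | |
| 22 | Efficient Intent-Based Filtering for Multi-Party Conversations Using Knowledge Distillation from LLMs | Proposes a knowledge-distillation-based intent filtering method to reduce the computational cost of LLMs in multi-party conversations. | distillation, large language model | |
| 23 | FastCuRL: Curriculum Reinforcement Learning with Stage-wise Context Scaling for Efficient Training R1-like Reasoning Models | FastCuRL: curriculum reinforcement learning with stage-wise context scaling for efficiently training R1-like reasoning models. | reinforcement learning | |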
