cs.CL (2025-03-10)

📊 42 papers in total | 🔗 2 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (33 🔗2) · Pillar 2: RL & Architecture (7) · Pillar 7: Motion Retargeting (1) · Pillar 1: Robot Control (1)

🔬 Pillar 9: Embodied Foundation Models (33 papers)

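# | Title | One-line summary | Tags | 🔗
1 | XIFBench: Evaluating Large Language Models on Multilingual Instruction Following | XIFBench: a comprehensive benchmark for evaluating the multilingual instruction-following ability of large language models | large language model, instruction following
2 | A Novel Ophthalmic Benchmark for Evaluating Multimodal Large Language Models with Fundus Photographs and OCT Images | Proposes an ophthalmic benchmark for multimodal large language models that evaluates their analysis of color fundus photographs and OCT images | large language model, multimodal
3 | Exploring Multimodal Perception in Large Language Models Through Perceptual Strength Ratings | Explores multimodal perception in large language models through perceptual strength ratings | large language model, multimodal
4 | Application of Multiple Chain-of-Thought in Contrastive Reasoning for Implicit Sentiment Analysis | Proposes dual and triple reverse chain reasoning frameworks to improve implicit sentiment analysis | large language model, chain-of-thought
5 | Multimodal Human-AI Synergy for Medical Imaging Quality Control: A Hybrid Intelligence Framework with Adaptive Dataset Curation and Closed-Loop Evaluation | Builds a medical imaging quality-control dataset and evaluation framework, and explores the use of large language models in quality control | large language model, multimodal
6 | cantnlp@DravidianLangTech2025: A Bag-of-Sounds Approach to Multimodal Hate Speech Detection | Proposes a bag-of-sounds multimodal hate speech detection system for Dravidian languages, exploring the potential of speech data for identifying hate speech | multimodal
7 | SEAP: Training-free Sparse Expert Activation Pruning Unlock the Brainpower of Large Language Models | Proposes SEAP, a training-free sparse expert activation pruning method that unlocks the potential of large language models | large language model
8 | Assessing the Macro and Micro Effects of Random Seeds on Fine-Tuning Large Language Models | Assesses the macro- and micro-level effects of random seeds on fine-tuning large language models | large language model
9 | TCM-3CEval: A Triaxial Benchmark for Assessing Responses from Large Language Models in Traditional Chinese Medicine | TCM-3CEval: builds a triaxial benchmark for evaluating large language models in Traditional Chinese Medicine, bridging the gap to clinical needs | large language model
10 | Large Language Models Often Say One Thing and Do Another | Proposes the WDCT benchmark, which reveals inconsistency between what large language models say and what they do, and examines the effect of alignment strategies | large language model
11 | Bot Wars Evolved: Orchestrating Competing LLMs in a Counterstrike Against Phone Scams | Proposes the Bot Wars framework, which uses LLMs to counter phone scams and exhibits emergent strategies | large language model, chain-of-thought
12 | Fine-Tuning LLMs for Report Summarization: Analysis on Supervised and Unsupervised Data | Studies the effectiveness of fine-tuning LLMs for report summarization with supervised and unsupervised data | large language model
13 | Can Memory-Augmented Language Models Generalize on Reasoning-in-a-Haystack Tasks? | Proposes MemReasoner to improve LLM generalization on complex reasoning tasks | large language model
14 | Gemini Embedding: Generalizable Embeddings from Gemini | Gemini Embedding: uses the Gemini large model to produce general-purpose text embeddings, substantially improving multilingual and multimodal text representations | large language model
15 | Datasets, Documents, and Repetitions: The Practicalities of Unequal Data Quality | Improves large language model performance under limited compute by repeating high-quality filtered datasets | large language model
16 | HalluVerse25: Fine-grained Multilingual Benchmark Dataset for LLM Hallucinations | Proposes HalluVerse25, a fine-grained multilingual benchmark dataset for evaluating LLM hallucinations | large language model
17 | Implicit Reasoning in Transformers is Reasoning through Shortcuts | Implicit reasoning in Transformers is essentially shortcut-based learning | large language model
18 | KSOD: Knowledge Supplement for LLMs On Demand | Proposes the KSOD framework, which supplements LLMs with knowledge on demand to improve performance on domain tasks | large language model
19 | ZeroSumEval: An Extensible Framework For Scaling LLM Evaluation with Inter-Model Competition | Proposes ZeroSumEval, a framework that scales LLM evaluation through inter-model competition | large language model
20 | TokenButler: Token Importance is Predictable | TokenButler: proposes a method for predicting token importance to ease the LLM KV-cache bottleneck | large language model
21 | Language Models Fail to Introspect About Their Knowledge of Language | Shows that large language models cannot effectively introspect about their knowledge of language | large language model
22 | Sometimes the Model doth Preach: Quantifying Religious Bias in Open LLMs through Demographic Analysis in Asian Nations | Quantifies religious bias in open LLMs through demographic analysis in Asian nations | large language model
23 | LLMs syntactically adapt their language use to their conversational partner | Shows that large language models adapt their language use at the syntactic level to match their conversational partner | large language model
24 | Revisiting Noise in Natural Language Processing for Computational Social Science | Revisits noise in natural language processing to advance computational social science research | large language model
25 | A Graph-based Verification Framework for Fact-Checking | Proposes the GraphFC framework, which uses graph-structured verification to address insufficient decomposition and reference ambiguity in misinformation detection | large language model
26 | MRCEval: A Comprehensive, Challenging and Accessible Machine Reading Comprehension Benchmark | Proposes MRCEval, a comprehensive, challenging, and accessible machine reading comprehension benchmark | large language model
27 | Linguistic Knowledge Transfer Learning for Speech Enhancement | Proposes CMKT, a cross-modal knowledge transfer framework that uses pretrained LLMs to improve speech enhancement | large language model
28 | Identity Lock: Locking API Fine-tuned LLMs With Identity-based Wake Words | Proposes the Identity Lock mechanism, which locks API fine-tuned LLMs behind identity-based wake words to guard against key leakage | large language model
29 | DatawiseAgent: A Notebook-Centric LLM Agent Framework for Adaptive and Robust Data Science Automation | DatawiseAgent: a notebook-centric, adaptive, and robust LLM agent framework for data science automation | large language model
30 | Toward Multi-Session Personalized Conversation: A Large-Scale Dataset and Hierarchical Tree Framework for Implicit Reasoning | Proposes the ImplexConv dataset and the TaciTree framework for implicit reasoning in multi-session personalized conversation | large language model
31 | Social Bias Benchmark for Generation: A Comparison of Generation and QA-Based Evaluations | Proposes the Bias Benchmark for Generation (BBG) for evaluating the social bias of large language models in long-form generation | large language model
32 | Effect of Selection Format on LLM Performance | Studies the effect of selection format on LLM performance and finds that bullet-point format is generally better | large language model
33 | Enhanced Multi-Tuple Extraction for Alloys: Integrating Pointer Networks and Augmented Attention | Proposes a multi-tuple extraction framework that combines pointer networks with augmented attention for information extraction from alloy materials literature | large language model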

🔬 Pillar 2: RL & Architecture (7 papers)

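# | Title | One-line summary | Tags | 🔗
34 | Towards Large Language Models that Benefit for All: Benchmarking Group Fairness in Reward Models | Evaluates group fairness in reward models, revealing significant unfairness of existing models across demographic groups | RLHF, large language model
35 | Detection Avoidance Techniques for Large Language Models | Studies techniques for evading large language model detectors and analyzes their performance | reinforcement learning, large language model
36 | DistiLLM-2: A Contrastive Approach Boosts the Distillation of LLMs | Proposes DistiLLM-2 to improve the distillation of large language models | distillation, large language model, instruction following
37 | Magnet: Multi-turn Tool-use Data Synthesis and Distillation via Graph Translation | Magnet: synthesizes and distills multi-turn tool-use data via graph translation to improve LLM function calling | distillation, large language model
38 | UC-MOA: Utility-Conditioned Multi-Objective Alignment for Distributional Pareto-Optimality | Proposes the UC-MOA framework, which conditions on utility functions to achieve distributional Pareto-optimality in multi-objective LLM alignment | reinforcement learning, RLHF, large language model
39 | LMM-R1: Empowering 3B LMMs with Strong Reasoning Abilities Through Two-Stage Rule-Based RL | LMM-R1: strengthens the reasoning ability of 3B LMMs through two-stage rule-based reinforcement learning | reinforcement learning, multimodal
40 | LexPro-1.0 Technical Report | LexPro-1.0: a high-accuracy reasoning large language model for the Chinese legal domain | reinforcement learning, large language model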

🔬 Pillar 7: Motion Retargeting (1 paper)

# | Title | One-line summary | Tags | 🔗
41 | MapQA: Open-domain Geospatial Question Answering on Map Data | Proposes the MapQA dataset for open-domain geospatial question answering on map data | spatial relationship, large language model

🔬 Pillar 1: Robot Control (1 paper)

# | Title | One-line summary | Tags | 🔗
42 | CtrlRAG: Black-box Adversarial Attacks Based on Masked Language Models in Retrieval-Augmented Language Generation | CtrlRAG: a black-box adversarial attack method against RAG based on masked language models | manipulation, large language model
