cs.CL (2024-05-13)

📊 26 papers total | 🔗 6 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (19 🔗6) · Pillar 2: RL & Architecture (4) · Pillar 3: Perception & Semantics (2) · Pillar 1: Robot Control (1)

🔬 Pillar 9: Embodied Foundation Models (19 papers)

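# | Title | One-sentence summary | Tags | 🔗
1 | AgentClinic: a multimodal agent benchmark to evaluate AI in simulated clinical environments | AgentClinic: a multimodal agent benchmark for evaluating AI in simulated clinical environments | large language model, multimodal |
2 | Who's in and who's out? A case study of multimodal CLIP-filtering in DataComp | Reveals data bias introduced by CLIP filtering: a multimodal analysis and fairness evaluation of the DataComp dataset | multimodal |
3 | Plot2Code: A Comprehensive Benchmark for Evaluating Multi-modal Large Language Models in Code Generation from Scientific Plots | Plot2Code: a comprehensive benchmark for evaluating how well multimodal large language models generate code from scientific plots | large language model |
4 | News Recommendation with Category Description by a Large Language Model | Proposes a news recommendation method that uses category descriptions generated automatically by a large language model to improve recommendation quality | large language model |
5 | Divergent Creativity in Humans and Large Language Models | Compares humans and large language models, evaluating semantic divergence to measure differences in creativity | large language model |
6 | LlamaTurk: Adapting Open-Source Generative Large Language Models for Low-Resource Language | LlamaTurk: explores methods for adapting open-source large language models to low-resource languages | large language model |
7 | UCCIX: Irish-eXcellence Large Language Model | UCCIX: a continued pre-training framework for large language models targeting the extremely low-resource Irish language | large language model |
8 | MuMath-Code: Combining Tool-Use Large Language Models with Multi-perspective Data Augmentation for Mathematical Reasoning | MuMath-Code: combines tool-use LLMs with multi-perspective data augmentation to improve mathematical reasoning | large language model |
9 | Evaluating large language models in medical applications: a survey | Surveys evaluation methods for large language models in medicine, addressing the challenge of complex medical information | large language model |
10 | Benchmarking Retrieval-Augmented Large Language Models in Biomedical NLP: Application, Robustness, and Self-Awareness | Systematically evaluates retrieval-augmented large language models in biomedical NLP on application, robustness, and self-awareness | large language model |
11 | EconLogicQA: A Question-Answering Benchmark for Evaluating Large Language Models in Economic Sequential Reasoning | EconLogicQA: a question-answering benchmark for sequential reasoning in economics that evaluates the logical ability of large language models | large language model |
12 | Russian-Language Multimodal Dataset for Automatic Summarization of Scientific Papers | Builds a Russian-language multimodal dataset of scientific papers and tests existing language models on automatic summarization | multimodal |
13 | EMS-SD: Efficient Multi-sample Speculative Decoding for Accelerating Large Language Models | Proposes EMS-SD, an efficient multi-sample speculative decoding method, to accelerate large language model inference | large language model |
14 | Interpreting Latent Student Knowledge Representations in Programming Assignments | Proposes the InfoOIRT model for interpreting latent student knowledge representations in programming assignments | large language model |
15 | PARDEN, Can You Repeat That? Defending against Jailbreaks via Repetition | PARDEN: defends large language models against jailbreak attacks by having the model repeat its output | large language model |
16 | Control Token with Dense Passage Retrieval | Enhances DPR models with control tokens to address hallucination in large language models | large language model |
17 | Many-Shot Regurgitation (MSR) Prompting | Proposes Many-Shot Regurgitation (MSR) prompting to assess the risk of content regurgitation in large language models | large language model |
18 | OpenLLM-Ro -- Technical Report on Open-source Romanian LLMs | OpenLLM-Ro: the first open-source Romanian foundation and chat large language models | large language model |
19 | MCS-SQL: Leveraging Multiple Prompts and Multiple-Choice Selection For Text-to-SQL Generation | MCS-SQL: uses multiple prompts and multiple-choice selection to improve text-to-SQL generation | large language model |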

🔬 Pillar 2: RL & Architecture (4 papers)

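# | Title | One-sentence summary | Tags | 🔗
20 | Strategic Data Ordering: Enhancing Large Language Model Performance through Curriculum Learning | Proposes a curriculum-learning-based data ordering strategy to improve large language model performance | curriculum learning, large language model |
21 | Simulate and Eliminate: Revoke Backdoors for Generative Large Language Models | Proposes the SANDE framework, which removes backdoors from generative large language models without requiring a clean model | reinforcement learning, RLHF, large language model |
22 | MetaReflection: Learning Instructions for Language Agents using Past Reflections | MetaReflection: learns instructions from past reflections to improve language agent performance | reinforcement learning, offline reinforcement learning, large language model |
23 | Quantifying and Optimizing Global Faithfulness in Persona-driven Role-playing | Proposes the APC metric to quantify role-playing faithfulness and uses it to optimize AI characters | DPO, direct preference optimization |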

🔬 Pillar 3: Perception & Semantics (2 papers)

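# | Title | One-sentence summary | Tags | 🔗
24 | Open-vocabulary Auditory Neural Decoding Using fMRI-prompted LLM | Proposes Brain Prompt GPT, which prompts an LLM with fMRI signals for open-vocabulary auditory neural decoding | open-vocabulary, open vocabulary |
25 | Constructing a BPE Tokenization DFA | Proposes an efficient algorithm for constructing a deterministic finite automaton (DFA) for BPE tokenization, aimed at the open-vocabulary problem | open-vocabulary, open vocabulary |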

🔬 Pillar 1: Robot Control (1 paper)

# | Title | One-sentence summary | Tags | 🔗
26 | MacBehaviour: An R package for behavioural experimentation on large language models | MacBehaviour: an R package for behavioural experiments on large language models | manipulation, large language model |
