cs.CL (2024-06-14)

📊 32 papers in total | 🔗 5 with code

🔬 Pillar 9: Embodied Foundation Models (26 papers)

| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 1 | Self-Reflection Makes Large Language Models Safer, Less Biased, and Ideologically Neutral | Uses self-reflection to make LLMs safer, less biased, and more ideologically neutral | large language model, chain-of-thought | |
| 2 | DevBench: A multimodal developmental benchmark for language learning | DevBench: a multimodal developmental benchmark for language learning, aimed at closing the gap between models and children's language acquisition | multimodal | |
| 3 | CliBench: A Multifaceted and Multigranular Evaluation of Large Language Models for Clinical Decision Making | CliBench: a multifaceted, multigranular benchmark for evaluating LLMs on clinical decision making | large language model | |
| 4 | Evaluation of Large Language Models: STEM education and Gender Stereotypes | Evaluates LLM bias around STEM education and gender stereotypes | large language model | |
| 5 | RadEx: A Framework for Structured Information Extraction from Radiology Reports based on Large Language Models | RadEx: an LLM-based framework for structured information extraction from radiology reports | large language model | |
| 6 | CHiSafetyBench: A Chinese Hierarchical Safety Benchmark for Large Language Models | Proposes CHiSafetyBench, a hierarchical benchmark for evaluating the safety of Chinese LLMs | large language model | |
| 7 | A Survey on Large Language Models from General Purpose to Medical Applications: Datasets, Methodologies, and Evaluations | Survey of LLMs for medical applications, covering datasets, methodologies, and evaluations | large language model | |
| 8 | SciEx: Benchmarking Large Language Models on Scientific Exams with Human Expert Grading and Automatic Grading | SciEx: an LLM benchmark built from university computer-science exam questions, with both human-expert and automatic grading | large language model | |
| 9 | SEACrowd: A Multilingual Multimodal Data Hub and Benchmark Suite for Southeast Asian Languages | SEACrowd: a multilingual, multimodal data hub and benchmark suite for Southeast Asian languages | multimodal | |
| 10 | On the Evaluation of Speech Foundation Models for Spoken Language Understanding | Evaluates speech foundation models on spoken language understanding tasks | foundation model | |
| 11 | Integrating Large Language Models with Graph-based Reasoning for Conversational Question Answering | Combines graph-based reasoning with LLMs for conversational question answering, improving complex reasoning | large language model | |
| 12 | GEB-1.3B: Open Lightweight Large Language Model | GEB-1.3B: a lightweight open-source LLM optimized for efficient CPU inference | large language model | |
| 13 | Precision Empowers, Excess Distracts: Visual Question Answering With Dynamically Infused Knowledge In Language Models | Dynamically infuses knowledge into language models to improve knowledge-based visual question answering | large language model, multimodal | |
| 14 | ChartMimic: Evaluating LMM's Cross-Modal Reasoning Capability via Chart-to-Code Generation | ChartMimic: evaluates LMMs' cross-modal reasoning via chart-to-code generation | multimodal | |
| 15 | A Training-free Sub-quadratic Cost Transformer Model Serving Framework With Hierarchically Pruned Attention | Proposes HiP: a training-free, sub-quadratic Transformer serving framework using hierarchically pruned attention for efficient long-context processing | large language model | |
| 16 | HIRO: Hierarchical Information Retrieval Optimization | HIRO: optimizes hierarchical information retrieval for RAG via depth-first search | large language model | |
| 17 | Retrieval Augmented Fact Verification by Synthesizing Contrastive Arguments | Proposes RAFTS: retrieval-augmented fact verification by synthesizing contrastive arguments | large language model | |
| 18 | Domain-Specific Shorthand for Generation Based on Context-Free Grammar | Domain-specific shorthand based on context-free grammars, reducing token counts when generating structured data with generative AI | large language model | |
| 19 | Be like a Goldfish, Don't Memorize! Mitigating Memorization in Generative LLMs | Proposes Goldfish Loss to mitigate memorization in generative LLMs, protecting privacy and copyright | large language model | |
| 20 | Exploring the Correlation between Human and Machine Evaluation of Simultaneous Speech Translation | Studies the correlation between machine and human evaluation of simultaneous speech translation, exploring the potential of GPT models | large language model | |
| 21 | A Better LLM Evaluator for Text Generation: The Impact of Prompt Output Sequencing and Optimization | Improves LLMs as text-generation evaluators by optimizing prompt output sequencing | large language model | |
| 22 | BLEnD: A Benchmark for LLMs on Everyday Knowledge in Diverse Cultures and Languages | BLEnD: a benchmark for LLMs on everyday knowledge across diverse cultures and languages | large language model | |
| 23 | Rapport-Driven Virtual Agent: Rapport Building Dialogue Strategy for Improving User Experience at First Meeting | A rapport-building dialogue strategy for virtual agents that improves user experience at first meetings | large language model | |
| 24 | On LLMs-Driven Synthetic Data Generation, Curation, and Evaluation: A Survey | Survey of LLM-driven synthetic data generation, curation, and evaluation, filling a framework gap in the field | large language model | |
| 25 | Detecting Response Generation Not Requiring Factual Judgment | Proposes the DDFC dataset for detecting sentences in dialogue generation that do not require factual judgment | large language model | |
| 26 | FreeCtrl: Constructing Control Centers with Feedforward Layers for Learning-Free Controllable Text Generation | FreeCtrl: builds control centers from feedforward layers for learning-free controllable text generation | large language model | |

🔬 Pillar 2: RL Algorithms & Architecture (6 papers)

| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 27 | Bootstrapping Language Models with DPO Implicit Rewards | DICE: bootstraps language models with DPO implicit rewards to improve alignment | reinforcement learning, RLHF, DPO | |
| 28 | What is the best model? Application-driven Evaluation for Large Language Models | A-Eval: an application-driven evaluation benchmark that helps users choose the best LLM for their use case | reinforcement learning, large language model, foundation model | |
| 29 | Knowledge Editing in Language Models via Adapted Direct Preference Optimization | Proposes KDPO, a knowledge-editing method based on adapted direct preference optimization, improving knowledge-update efficiency | DPO, direct preference optimization, large language model | |
| 30 | Regularizing Hidden States Enables Learning Generalizable Reward Model for LLMs | Regularizes hidden states to learn more generalizable reward models for LLMs | reinforcement learning, preference learning, RLHF | |
| 31 | Self-Knowledge Distillation for Learning Ambiguity | Proposes self-knowledge distillation to address language models' overconfidence on ambiguous samples | distillation | |
| 32 | Pcc-tuning: Breaking the Contrastive Learning Ceiling in Semantic Textual Similarity | Pcc-tuning: breaks the contrastive-learning performance ceiling in semantic textual similarity | contrastive learning | |