cs.CL (2024-06-25)

📊 40 papers in total | 🔗 4 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (36 🔗4) · Pillar 2: RL & Architecture (4)

🔬 Pillar 9: Embodied Foundation Models (36 papers)

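# | Title | Key point | Tags | 🔗
1 | Panacea: A foundation model for clinical trial search, summarization, design, and recruitment | Proposes Panacea, a clinical-trial foundation model that tackles multiple clinical-trial tasks and improves the efficiency of search, summarization, design, and recruitment. | large language model, foundation model
2 | Autonomous Prompt Engineering in Large Language Models | Proposes APET, which uses GPT-4 to perform prompt engineering autonomously and improve LLM performance on specific tasks. | large language model, chain-of-thought
3 | CharED: Character-wise Ensemble Decoding for Large Language Models | Proposes CharED, a character-wise ensemble decoding method that improves LLM performance across multiple domains. | large language model
4 | Improving Arithmetic Reasoning Ability of Large Language Models through Relation Tuples, Verification and Dynamic Feedback | Proposes a framework based on relation tuples, verification, and dynamic feedback to improve the arithmetic reasoning of LLMs. | large language model
5 | Accelerating Clinical Evidence Synthesis with Large Language Models | TrialMind: uses large language models to accelerate clinical evidence synthesis, improving efficiency and accuracy. | large language model
6 | Measuring and Benchmarking Large Language Models' Capabilities to Generate Persuasive Language | Builds the Persuasive-Pairs dataset to measure and benchmark LLMs' ability to generate persuasive language. | large language model
7 | Using Large Language Models in Public Transit Systems, San Antonio as a case study | Uses large language models to optimize public transit systems, with San Antonio as a case study. | large language model
8 | From Distributional to Overton Pluralism: Investigating Large Language Model Alignment | Shows that the behavior of aligned LLMs can be reproduced by base models through in-context learning. | large language model
9 | Evaluating Large Language Models with Psychometrics | Proposes a psychometric benchmark that evaluates LLMs' performance and consistency along psychological dimensions. | large language model
10 | CoSafe: Evaluating Large Language Model Safety in Multi-Turn Dialogue Coreference | CoSafe: evaluates the safety of large language models under coreference in multi-turn dialogue. | large language model
11 | Self-assessment, Exhibition, and Recognition: a Review of Personality in Large Language Models | A review of personality in large language models, covering self-assessment, exhibition, and recognition. | large language model
12 | Multi-property Steering of Large Language Models with Dynamic Activation Composition | Proposes Dynamic Activation Composition for multi-property controllable generation with LLMs, improving fluency. | large language model
13 | Entropy-Based Decoding for Retrieval-Augmented Large Language Models | Proposes an entropy-based decoding method to address the distraction problem in retrieval-augmented LLMs. | large language model
14 | Enhancing Tool Retrieval with Iterative Feedback from Large Language Models | Proposes a tool retrieval method driven by iterative feedback from LLMs, improving tool selection accuracy in complex scenarios. | large language model
15 | MoE-CT: A Novel Approach For Large Language Models Training With Resistance To Catastrophic Forgetting | Proposes the MoE-CT architecture to address degraded low-resource-language performance during continual training of LLMs. | large language model
16 | Generative AI Systems: A Systems-based Perspective on Generative AI | Proposes GenAISys, a systems-based perspective on generative AI that focuses on multimodal processing, content generation, and decision-making. | large language model, multimodal
17 | Encourage or Inhibit Monosemanticity? Revisit Monosemanticity from a Feature Decorrelation Perspective | Revisits monosemanticity from a feature decorrelation perspective and proposes encouraging monosemanticity to improve model capability. | large language model
18 | Unmasking the Imposters: How Censorship and Domain Adaptation Affect the Detection of Machine-Generated Tweets | Studies how censorship and domain adaptation affect the detection of machine-generated tweets, revealing the threat posed by "imposters". | large language model
19 | Crafting Customisable Characters with LLMs: A Persona-Driven Role-Playing Agent Framework | Proposes the SimsChat framework, which uses LLMs to create customizable role-playing agents. | large language model
20 | Evaluating the Efficacy of Foundational Models: Advancing Benchmarking Practices to Enhance Fine-Tuning Decision-Making | Proposes ThroughCut, an outlier detection technique, to evaluate LLMs' baseline performance across multiple domains before fine-tuning. | large language model
21 | RAGBench: Explainable Benchmark for Retrieval-Augmented Generation Systems | RAGBench: an explainable benchmark for retrieval-augmented generation systems. | large language model
22 | X-ray Made Simple: Lay Radiology Report Generation and Robust Evaluation | Proposes the Layman's RRG framework to address insufficiently robust evaluation in radiology report generation and reports that are hard for patients to understand. | multimodal
23 | The ALCHEmist: Automated Labeling 500x CHEaper Than LLM Data Annotators | ALCHEmist: labels data automatically by generating programs, at 1/500 the cost of LLM annotation. | large language model
24 | Following Length Constraints in Instructions | Proposes a length-constrained instruction-following model that addresses the length bias of existing models and outperforms GPT4 and other models on length-control evaluations. | instruction following
25 | LLM Targeted Underperformance Disproportionately Impacts Vulnerable Users | Reveals LLM information bias against vulnerable user groups: the effects of English proficiency, education level, and country of origin. | large language model
26 | VarBench: Robust Language Model Benchmarking Through Dynamic Variable Perturbation | Proposes VarBench, which benchmarks language models robustly through dynamic variable perturbation. | large language model
27 | This Paper Had the Smartest Reviewers -- Flattery Detection Utilising an Audio-Textual Transformer-Based Approach | Proposes a multimodal audio-textual Transformer-based approach for detecting flattery in speech. | multimodal
28 | LLM-ARC: Enhancing LLMs with an Automated Reasoning Critic | LLM-ARC: strengthens the logical reasoning of LLMs with an automated reasoning critic. | large language model
29 | Banishing LLM Hallucinations Requires Rethinking Generalization | Rethinks generalization in order to eliminate LLM hallucinations. | large language model
30 | "Seeing the Big through the Small": Can LLMs Approximate Human Judgment Distributions on NLI from a Few Explanations? | Given a few explanations, LLMs can approximate human judgment distributions on natural language inference, improving annotation efficiency. | large language model
31 | LongIns: A Challenging Long-context Instruction-based Exam for LLMs | LongIns: an instruction-based exam benchmark for evaluating LLMs' long-context understanding and reasoning. | large language model
32 | Beyond Text-to-SQL for IoT Defense: A Comprehensive Framework for Querying and Classifying IoT Threats | Proposes a text-to-SQL framework for IoT defense that queries and classifies IoT threats, and builds an accompanying dataset. | large language model
33 | FrenchToxicityPrompts: a Large Benchmark for Evaluating and Mitigating Toxicity in French Texts | Proposes FrenchToxicityPrompts for evaluating and mitigating toxicity in French texts. | large language model
34 | The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale | Proposes the FineWeb datasets, improving the quality of LLM pretraining data and the resulting model performance. | large language model
35 | Retrieval-Augmented Code Generation for Situated Action Generation: A Case Study on Minecraft | Uses retrieval-augmented code generation to improve situated action generation in Minecraft. | large language model
36 | Disce aut Deficere: Evaluating LLMs Proficiency on the INVALSI Italian Benchmark | Proposes the INVALSI benchmark to evaluate LLMs' proficiency in Italian. | large language model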

🔬 Pillar 2: RL & Architecture (4 papers)

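# | Title | Key point | Tags | 🔗
37 | Knowledge Distillation in Automated Annotation: Supervised Text Classification with LLM-Generated Training Labels | Uses LLM-generated labels for knowledge distillation, improving the efficiency and cost-effectiveness of supervised text classification. | distillation, large language model
38 | Learning to Ask Informative Questions: Enhancing LLMs with Preference Optimization and Expected Information Gain | Proposes a question-asking method based on preference optimization and expected information gain, improving LLM performance on information-seeking tasks. | DPO, direct preference optimization, large language model
39 | PAFT: A Parallel Training Paradigm for Effective LLM Fine-Tuning | PAFT: a parallel training paradigm for effective LLM fine-tuning that addresses the alignment tax. | DPO, large language model
40 | Retrieval-style In-Context Learning for Few-shot Hierarchical Text Classification | Proposes a retrieval-based in-context learning framework for few-shot hierarchical text classification. | contrastive learning, large language model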
