cs.CL (2024-06-21)

📊 35 papers in total | 🔗 5 with code

🎯 Interest area navigation

Pillar 9: Embodied Foundation Models (31 🔗4) · Pillar 2: RL Algorithms & Architecture (3 🔗1) · Pillar 1: Robot Control (1)

🔬 Pillar 9: Embodied Foundation Models (31 papers)

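# | Title | One-line summary | Tags | 🔗
1 | From LLMs to MLLMs: Exploring the Landscape of Multimodal Jailbreaking | Survey of jailbreak attacks on large language models and multimodal large language models | large language model, multimodal
2 | Benchmarking Uncertainty Quantification Methods for Large Language Models with LM-Polygraph | LM-Polygraph: a comprehensive benchmark for uncertainty quantification in large language models | large language model
3 | Rethinking Pruning Large Language Models: Benefits and Pitfalls of Reconstruction Error Minimization | Rethinks large language model pruning: the benefits and pitfalls of reconstruction error minimization | large language model
4 | Large Language Models have Intrinsic Self-Correction Ability | Reveals the intrinsic self-correction ability of large language models, stressing the importance of zero temperature and fair prompts | large language model
5 | Talking the Talk Does Not Entail Walking the Walk: On the Limits of Large Language Models in Lexical Entailment Recognition | Evaluates the limits of large language models in lexical entailment recognition, revealing challenges in understanding verb semantics | large language model
6 | ICLEval: Evaluating In-Context Learning Ability of Large Language Models | ICLEval: a new benchmark for evaluating the in-context learning ability of large language models | large language model
7 | ESC-Eval: Evaluating Emotion Support Conversations in Large Language Models | Proposes the ESC-Eval framework for evaluating emotional support conversations in large language models | large language model
8 | Safely Learning with Private Data: A Federated Learning Framework for Large Language Model | Proposes FL-GLM, a secure federated learning framework for large language models that addresses privacy leakage and efficiency | large language model
9 | InternLM-Law: An Open Source Chinese Legal Large Language Model | Proposes InternLM-Law, an open-source Chinese legal large language model for handling complex legal queries | large language model
10 | 70B-parameter large language models in Japanese medical question-answering | Uses 70B-parameter large language models with instruction tuning to improve Japanese medical question answering | large language model
11 | Sports Intelligence: Assessing the Sports Understanding Capabilities of Language Models through Question Answering from Text to Video | Proposes the Sports Intelligence benchmark to evaluate language models' sports understanding, filling a gap in multimodal sports understanding | large language model, multimodal, chain-of-thought
12 | Leveraging Passage Embeddings for Efficient Listwise Reranking with Large Language Models | Proposes PE-Rank, which uses passage embeddings to speed up listwise reranking with large language models | large language model
13 | Towards Retrieval Augmented Generation over Large Video Libraries | Proposes a retrieval-augmented generation question-answering system over large video libraries to support efficient reuse of video content | large language model, TAMP
14 | DEM: Distribution Edited Model for Training with Mixed Data Distributions | Proposes the Distribution Edited Model (DEM) for efficient training on mixed data distributions | instruction following
15 | Synthetic Lyrics Detection Across Languages and Genres | Proposes synthetic lyrics detection methods to address copyright and content transparency concerns | large language model
16 | Brain-Like Language Processing via a Shallow Untrained Multihead Attention Network | Proposes a brain-like language processing model based on a shallow untrained multihead attention network | large language model
17 | ToVo: Toxicity Taxonomy via Voting | ToVo: a voting-based method for toxic content classification that addresses the limited transparency, customizability, and reproducibility of existing models | chain-of-thought
18 | Assessing Good, Bad and Ugly Arguments Generated by ChatGPT: a New Dataset, its Methodology and Associated Tasks | Proposes the ArGPT dataset for evaluating and improving the quality of arguments generated by large language models | large language model
19 | News Deja Vu: Connecting Past and Present with Semantic Search | News Deja Vu: uses semantic search to connect historical and modern news, supporting social science research | large language model
20 | Detecting AI-Generated Text: Factors Influencing Detectability with Current Methods | Surveys AI-generated text detection methods, analyzes the key factors that affect detectability, and offers recommendations for future research | large language model
21 | How language models extrapolate outside the training data: A case study in Textualized Gridworld | Proposes a cognitive-map-based CoT framework that improves language models' extrapolation in textualized Gridworld | chain-of-thought
22 | Towards Fine-Grained Citation Evaluation in Generated Text: A Comparative Analysis of Faithfulness Metrics | Proposes a fine-grained citation evaluation framework and analyzes how effective existing faithfulness metrics are on generated text | large language model
23 | A LLM-Based Ranking Method for the Evaluation of Automatic Counter-Narrative Generation | Proposes an LLM-ranking-based evaluation method for automatic counter-narrative generation that markedly improves correlation with human judgments | large language model
24 | Unsupervised Extraction of Dialogue Policies from Conversations | Proposes a graph-based unsupervised method for extracting dialogue policies, improving the efficiency of task-oriented dialogue system development | large language model
25 | PARIKSHA: A Large-Scale Investigation of Human-LLM Evaluator Agreement on Multilingual and Multi-Cultural Data | PARIKSHA: a large-scale study of agreement between human and LLM evaluators on multilingual and multi-cultural data | large language model
26 | Retrieve-Plan-Generation: An Iterative Planning and Answering Framework for Knowledge-Intensive LLM Generation | Proposes the Retrieve-Plan-Generation framework, which strengthens knowledge-intensive LLM generation through iterative planning and retrieval | large language model
27 | A Tale of Trust and Accuracy: Base vs. Instruct LLMs in RAG Systems | In RAG systems, base LLMs outperform instruct LLMs, by 20% on average | large language model
28 | AgriLLM: Harnessing Transformers for Farmer Queries | AgriLLM: uses Transformers to answer farmer queries | large language model
29 | OATH-Frames: Characterizing Online Attitudes Towards Homelessness with LLM Assistants | Proposes the OATH-Frames framework, which uses LLM assistance to analyze attitudes towards homelessness in online media | large language model
30 | Word Matters: What Influences Domain Adaptation in Summarization? | Studies how words influence domain adaptation in summarization and proposes a performance prediction method based on learning difficulty | large language model
31 | How Well Do LLMs Represent Values Across Cultures? Empirical Analysis of LLM Responses Based on Hofstede Cultural Dimensions | Evaluates how well large language models represent values across cultures, based on Hofstede's cultural dimensions | large language model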

🔬 Pillar 2: RL Algorithms & Architecture (3 papers)

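# | Title | One-line summary | Tags | 🔗
32 | Hybrid Alignment Training for Large Language Models | Proposes Hybrid Alignment Training (Hbat) to resolve the conflict between instruction following and preference alignment in large language models | direct preference optimization, large language model, instruction following
33 | Direct Multi-Turn Preference Optimization for Language Agents | Proposes DMPO, which tackles direct preference optimization for multi-turn language agents by optimizing the state-action occupancy measure | reinforcement learning, DPO, direct preference optimization
34 | Error Correction in Radiology Reports: A Knowledge Distillation-Based Multi-Stage Framework | Proposes a knowledge-distillation-based multi-stage framework for correcting errors in radiology reports | distillation, large language model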

🔬 Pillar 1: Robot Control (1 paper)

# | Title | One-line summary | Tags | 🔗
35 | SpreadsheetBench: Towards Challenging Real World Spreadsheet Manipulation | Proposes SpreadsheetBench, a spreadsheet manipulation benchmark grounded in real-world scenarios | manipulation, large language model
