cs.CL (2024-10-17)

📊 64 papers in total | 🔗 9 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (55 🔗8) · Pillar 2: RL Algorithms & Architecture (9 🔗1)

🔬 Pillar 9: Embodied Foundation Models (55 papers)

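| # | Title | Key point | Tags | 🔗 |
|---|---|---|---|---|
| 1 | Eliciting Uncertainty in Chain-of-Thought to Mitigate Bias against Forecasting Harmful User Behaviors | Uses uncertainty in chain-of-thought to mitigate bias in forecasting harmful user behaviors | large language model, chain-of-thought | |
| 2 | Do LLMs Overcome Shortcut Learning? An Evaluation of Shortcut Challenges in Large Language Models | Proposes Shortcut Suite to evaluate shortcut learning in large language models | large language model, chain-of-thought | |
| 3 | Roadmap towards Superhuman Speech Understanding using Large Language Models | Proposes a roadmap toward LLM-based superhuman speech understanding, together with the SAGI benchmark evaluation framework | large language model, foundation model | |
| 4 | Learning Multimodal Cues of Children's Uncertainty | Builds a dataset of multimodal cues of children's uncertainty and proposes a model to predict children's uncertainty | multimodal | |
| 5 | Aggregation Artifacts in Subjective Tasks Collapse Large Language Models' Posteriors | Reveals how aggregation artifacts in subjective-task data affect the posterior distributions of large language models | large language model | |
| 6 | Semi-supervised Fine-tuning for Large Language Models | Proposes the SemiEvol framework, which uses semi-supervised fine-tuning to improve LLM performance with limited labeled data | large language model | |
| 7 | On the Role of Attention Heads in Large Language Model Safety | Proposes the Safety Head ImPortant Score (Ships) and the Sahara algorithm for evaluating and attributing safety attention heads in LLMs | large language model | |
| 8 | UCFE: A User-Centric Financial Expertise Benchmark for Large Language Models | Proposes UCFE, a user-centric financial expertise benchmark for evaluating large language models | large language model | |
| 9 | RiTeK: A Dataset for Large Language Models Complex Reasoning over Textual Knowledge Graphs in Medicine | RiTeK: a dataset for evaluating LLMs' complex reasoning over medical textual knowledge graphs | large language model | |
| 10 | Ethics Whitepaper: Whitepaper on Ethical Research into Large Language Models | Whitepaper on ethical research into large language models, offering ethical guidance and practical norms for LLM research | large language model | |
| 11 | De-mark: Watermark Removal in Large Language Models | Proposes the De-mark framework, which effectively removes n-gram-based watermarks from large language models | large language model | |
| 12 | Knowledge-Aware Query Expansion with Large Language Models for Textual and Relational Retrieval | Proposes a knowledge-aware query expansion framework that uses LLMs to improve textual and relational retrieval | large language model | |
| 13 | SynapticRAG: Enhancing Temporal Memory Retrieval in Large Language Models through Synaptic Mechanisms | SynapticRAG: enhances temporal memory retrieval in LLMs through synaptic mechanisms | large language model | |
| 14 | Parameter-efficient Adaptation of Multilingual Multimodal Models for Low-resource ASR | Combines parameter-efficient fine-tuning with text adaptation to improve multilingual multimodal models for low-resource ASR | multimodal | |
| 15 | Think Thrice Before You Act: Progressive Thought Refinement in Large Language Models | Proposes the PTR framework, which improves LLM performance in open-ended scenarios through progressive thought refinement | large language model | |
| 16 | BiasJailbreak: Analyzing Ethical Biases and Jailbreak Vulnerabilities in Large Language Models | BiasJailbreak: reveals and exploits ethical biases in LLMs for adversarial attacks, and proposes a defense method | large language model | |
| 17 | Advancing Large Language Model Attribution through Self-Improving | Proposes the START framework, which improves LLMs' factual attribution ability through iterative self-improvement | large language model | |
| 18 | Large Language Models are Easily Confused: A Quantitative Metric, Security Implications and Typological Analysis | Proposes Language Confusion Entropy to quantify language confusion in LLMs and analyzes its security implications | large language model | |
| 19 | CBT-Bench: Evaluating Large Language Models on Assisting Cognitive Behavior Therapy | Proposes the CBT-BENCH benchmark for evaluating LLMs in assisting cognitive behavior therapy | large language model | |
| 20 | BQA: Body Language Question Answering Dataset for Video Large Language Models | Proposes the BQA dataset for evaluating video LLMs' understanding of body language | large language model | |
| 21 | Evaluating Self-Generated Documents for Enhancing Retrieval-Augmented Generation with Large Language Models | Evaluates self-generated documents as a way to enhance retrieval-augmented generation with LLMs | large language model | |
| 22 | aiXcoder-7B: A Lightweight and Effective Large Language Model for Code Processing | Proposes aiXcoder-7B, a lightweight and efficient code LLM that improves code completion accuracy and efficiency | large language model | |
| 23 | Better to Ask in English: Evaluation of Large Language Models on English, Low-resource and Cross-Lingual Settings | Evaluates the performance gap of LLMs between English and low-resource languages, revealing challenges for cross-lingual use | large language model | |
| 24 | Data Defenses Against Large Language Models | Proposes data defenses that use adversarial prompt injection to protect data from improper inference by LLMs | large language model | |
| 25 | Can MLLMs Understand the Deep Implication Behind Chinese Images? | Proposes the CII-Bench benchmark for evaluating multimodal LLMs' understanding of the deeper implications of Chinese images | large language model, multimodal | |
| 26 | Retrospective Learning from Interactions | ReSpect: uses implicit feedback in interaction history to improve the reasoning ability of multimodal LLMs | large language model, multimodal | |
| 27 | Generating Signed Language Instructions in Large-Scale Dialogue Systems | Builds a signed-language instruction generation system on top of a large-scale dialogue system to improve multimodal interaction | large language model, multimodal | |
| 28 | SimpleToM: Exposing the Gap between Explicit ToM Inference and Implicit ToM Application in LLMs | SimpleToM: exposes the gap between explicit theory-of-mind inference and its implicit application in LLMs | large language model, chain-of-thought | |
| 29 | GeoCoder: Solving Geometry Problems by Generating Modular Code through Vision-Language Models | GeoCoder: solves geometry problems by generating modular code with vision-language models | multimodal | |
| 30 | Reference-Based Post-OCR Processing with LLM for Precise Diacritic Text in Historical Document Recognition | Proposes a post-OCR processing method based on an LLM and reference books that improves text recognition accuracy for historical documents | large language model | |
| 31 | Detecting AI-Generated Texts in Cross-Domains | Proposes the RoBERTa-Ranker model to address the performance drop in cross-domain detection of AI-generated text | large language model | |
| 32 | MIRAGE-Bench: Automatic Multilingual Benchmark Arena for Retrieval-Augmented Generation Systems | Proposes MIRAGE-Bench, a benchmark platform for automatically evaluating multilingual retrieval-augmented generation systems | large language model | |
| 33 | Retrieval of Temporal Event Sequences from Textual Descriptions | Proposes the TPP-Embedding model for retrieving temporal event sequences from textual descriptions, and builds the TESRBench benchmark | large language model | |
| 34 | Measuring and Modifying the Readability of English Texts with GPT-4 | Uses GPT-4 to measure and modify the readability of English texts, significantly outperforming traditional methods | large language model | |
| 35 | LLMs are Biased Teachers: Evaluating LLM Bias in Personalized Education | Reveals the biases of LLMs acting as "teachers" in personalized education and proposes evaluation metrics | large language model | |
| 36 | From Single to Multi: How LLMs Hallucinate in Multi-Document Summarization | Studies hallucination in multi-document summarization with the aim of improving LLM performance | large language model | |
| 37 | BenTo: Benchmark Task Reduction with In-Context Transferability | BenTo: reduces the tasks in LLM evaluation benchmarks using in-context transferability | large language model | |
| 38 | Modeling Future Conversation Turns to Teach LLMs to Ask Clarifying Questions | Improves LLMs' ability to ask clarifying questions by modeling future conversation turns | large language model | |
| 39 | Unconstrained Model Merging for Enhanced LLM Reasoning | Proposes an unconstrained model merging framework that improves LLM performance on reasoning tasks | large language model | |
| 40 | ORCHID: A Chinese Debate Corpus for Target-Independent Stance Detection and Argumentative Dialogue Summarization | Proposes ORCHID, a Chinese debate corpus for target-independent stance detection and debate dialogue summarization | large language model | |
| 41 | A Comparative Study on Reasoning Patterns of OpenAI's o1 Model | A comparative study of the reasoning patterns of OpenAI's o1 model, revealing its strengths in math, code, and commonsense reasoning | large language model | |
| 42 | Bias in the Mirror: Are LLMs opinions robust to their own adversarial attacks ? | Proposes an LLM self-debate framework to evaluate the robustness of model biases under adversarial attacks | large language model | |
| 43 | RAG-DDR: Optimizing Retrieval-Augmented Generation Using Differentiable Data Rewards | Proposes RAG-DDR, which optimizes retrieval-augmented generation with differentiable data rewards and improves knowledge utilization in small models | large language model | |
| 44 | IterSelectTune: An Iterative Training Framework for Efficient Instruction-Tuning Data Selection | IterSelectTune: an iterative training framework for efficient instruction-tuning data selection | large language model | |
| 45 | From Citations to Criticality: Predicting Legal Decision Influence in the Multilingual Swiss Jurisprudence | Proposes the Criticality Prediction dataset for predicting the influence of Swiss legal decisions and improving case prioritization | large language model | |
| 46 | Judgment of Learning: A Human Ability Beyond Generative Artificial Intelligence | Reveals a metacognitive limitation of LLMs: they underperform humans on judgment-of-learning tasks | large language model | |
| 47 | LAR-ECHR: A New Legal Argument Reasoning Task and Dataset for Cases of the European Court of Human Rights | Proposes the LAR-ECHR dataset for evaluating LLMs' legal reasoning on European Court of Human Rights cases | large language model | |
| 48 | Cerberus: Efficient Inference with Adaptive Parallel Decoding and Sequential Knowledge Enhancement | Cerberus: efficient LLM inference through adaptive parallel decoding and sequential knowledge enhancement | large language model | |
| 49 | Learning to Route LLMs with Confidence Tokens | Proposes Self-REF, which uses confidence tokens to improve LLM reliability and accuracy on downstream tasks | large language model | |
| 50 | Breaking Chains: Unraveling the Links in Multi-Hop Knowledge Unlearning | Proposes MUNCH, an uncertainty-based multi-hop knowledge unlearning method that addresses the shortcomings of existing methods on indirect reasoning | large language model | |
| 51 | Atomic Calibration of LLMs in Long-Form Generations | Proposes atomic calibration to assess fine-grained hallucination in LLMs' long-form generation | large language model | |
| 52 | SPIN: Self-Supervised Prompt INjection | SPIN: self-supervised prompt injection for detecting and defending against adversarial attacks on LLMs | large language model | |
| 53 | FaithBench: A Diverse Hallucination Benchmark for Summarization by Modern LLMs | FaithBench: a diverse benchmark for summarization hallucinations by modern LLMs | large language model | |
| 54 | The Geometry of Numerical Reasoning: Language Models Compare Numeric Properties in Linear Subspaces | Explores the geometry of numerical reasoning in LLMs: language models compare numeric properties in linear subspaces | large language model | |
| 55 | SLM-Mod: Small Language Models Surpass LLMs at Content Moderation | SLM-Mod: small language models surpass LLMs at content moderation | large language model | |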

🔬 Pillar 2: RL Algorithms & Architecture (9 papers)

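| # | Title | Key point | Tags | 🔗 |
|---|---|---|---|---|
| 56 | Representation Learning of Structured Data for Medical Foundation Models | UniStruct: proposes a structured-data representation learning method for the medical domain that improves medical foundation model performance | representation learning, large language model, foundation model | |
| 57 | CLaMP 2: Multimodal Music Information Retrieval Across 101 Languages Using Large Language Models | CLaMP 2: multimodal music information retrieval across 101 languages using LLMs | contrastive learning, large language model, multimodal | |
| 58 | Boosting LLM Translation Skills without General Ability Loss via Rationale Distillation | Proposes RaDis, which boosts LLM translation ability via rationale distillation while avoiding loss of general abilities | distillation, large language model, instruction following | |
| 59 | An Active Learning Framework for Inclusive Generation by Large Language Models | Proposes a clustering-based active learning framework that improves LLM generation for diverse subgroups | distillation, large language model | |
| 60 | Web Agents with World Models: Learning and Leveraging Environment Dynamics in Web Navigation | Proposes WMA, a world-model-based web agent that improves decision-making in web navigation tasks | world model, large language model | |
| 61 | Enhancing Text Generation in Joint NLG/NLU Learning Through Curriculum Learning, Semi-Supervised Training, and Advanced Optimization Techniques | Proposes a method for enhancing text generation in joint NLG/NLU learning that combines curriculum learning, semi-supervised training, and optimization algorithms | reinforcement learning, curriculum learning | |
| 62 | PopAlign: Diversifying Contrasting Patterns for a More Comprehensive Alignment | PopAlign: more comprehensive LLM alignment through diversified contrasting patterns | RLHF, large language model | |
| 63 | Learning Metadata-Agnostic Representations for Text-to-SQL In-Context Example Selection | Proposes MARLO for in-context example selection in text-to-SQL, improving generalization | representation learning, large language model | |
| 64 | SeerAttention: Learning Intrinsic Sparse Attention in Your LLMs | SeerAttention: learns intrinsic sparse attention in LLMs to improve long-context processing efficiency | distillation, large language model | |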
