cs.CL (2024-10-17)

📊 64 papers in total | 🔗 9 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (55 🔗8) Pillar 2: RL & Architecture (9 🔗1)

🔬 Pillar 9: Embodied Foundation Models (55 papers)

#	Title	One-sentence summary	Tags	🔗
1	Eliciting Uncertainty in Chain-of-Thought to Mitigate Bias against Forecasting Harmful User Behaviors	Uses uncertainty elicited in chain-of-thought to mitigate bias when forecasting harmful user behaviors	large language model chain-of-thought
2	Do LLMs Overcome Shortcut Learning? An Evaluation of Shortcut Challenges in Large Language Models	Proposes Shortcut Suite to evaluate shortcut learning in large language models	large language model chain-of-thought	✅
3	Roadmap towards Superhuman Speech Understanding using Large Language Models	Proposes an LLM-based roadmap toward superhuman speech understanding, along with the SAGI benchmark	large language model foundation model
4	Learning Multimodal Cues of Children's Uncertainty	Builds a dataset of multimodal cues of children's uncertainty and proposes a model to predict that uncertainty	multimodal
5	Aggregation Artifacts in Subjective Tasks Collapse Large Language Models' Posteriors	Shows how aggregation artifacts in subjective-task data affect the posterior distributions of large language models	large language model
6	Semi-supervised Fine-tuning for Large Language Models	Proposes the SemiEvol framework, which uses semi-supervised fine-tuning to improve LLM performance with limited labeled data	large language model
7	On the Role of Attention Heads in Large Language Model Safety	Proposes the Safety Head ImPortant Score (Ships) and the Sahara algorithm to evaluate and attribute safety attention heads in LLMs	large language model
8	UCFE: A User-Centric Financial Expertise Benchmark for Large Language Models	Proposes UCFE, a user-centric financial expertise benchmark for evaluating large language models	large language model
9	RiTeK: A Dataset for Large Language Models Complex Reasoning over Textual Knowledge Graphs in Medicine	RiTeK: a dataset for evaluating LLMs' complex reasoning over medical textual knowledge graphs	large language model
10	Ethics Whitepaper: Whitepaper on Ethical Research into Large Language Models	A whitepaper on ethical research into LLMs, offering ethical guidance and practical norms for LLM research	large language model
11	De-mark: Watermark Removal in Large Language Models	Proposes the De-mark framework, which effectively removes n-gram-based watermarks from large language models	large language model
12	Knowledge-Aware Query Expansion with Large Language Models for Textual and Relational Retrieval	Proposes a knowledge-aware query expansion framework that uses LLMs to improve textual and relational retrieval	large language model
13	SynapticRAG: Enhancing Temporal Memory Retrieval in Large Language Models through Synaptic Mechanisms	SynapticRAG: enhances temporal memory retrieval in LLMs through synaptic mechanisms	large language model
14	Parameter-efficient Adaptation of Multilingual Multimodal Models for Low-resource ASR	Combines parameter-efficient fine-tuning with text adaptation to improve multilingual multimodal models on low-resource ASR	multimodal
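15	Think Thrice Before You Act: Progressive Thought Refinement in Large Language Models	Proposes the PTR framework, which improves LLM performance in open-ended scenarios through progressive thought refinement	large language model
16	BiasJailbreak: Analyzing Ethical Biases and Jailbreak Vulnerabilities in Large Language Models	BiasJailbreak: reveals and exploits ethical biases in LLMs for adversarial attacks, and proposes a defense method	large language model
17	Advancing Large Language Model Attribution through Self-Improving	Proposes the START framework, which improves LLMs' factual attribution through iterative self-improvement	large language model
18	Large Language Models are Easily Confused: A Quantitative Metric, Security Implications and Typological Analysis	Proposes Language Confusion Entropy to quantify language confusion in LLMs and analyzes its security implications	large language model
19	CBT-Bench: Evaluating Large Language Models on Assisting Cognitive Behavior Therapy	Proposes the CBT-BENCH benchmark to evaluate LLMs on assisting cognitive behavior therapy	large language model
20	BQA: Body Language Question Answering Dataset for Video Large Language Models	Proposes the BQA dataset to evaluate video LLMs' understanding of body language	large language model
21	Evaluating Self-Generated Documents for Enhancing Retrieval-Augmented Generation with Large Language Models	Evaluates self-generated documents as a way to enhance retrieval-augmented generation with LLMs	large language model
22	aiXcoder-7B: A Lightweight and Effective Large Language Model for Code Processing	Proposes aiXcoder-7B, a lightweight and effective code LLM that improves code completion accuracy and efficiency	large language model
23	Better to Ask in English: Evaluation of Large Language Models on English, Low-resource and Cross-Lingual Settings	Evaluates the performance gap of LLMs between English and low-resource languages, revealing challenges in cross-lingual use	large language model
24	Data Defenses Against Large Language Models	Proposes data defenses that use adversarial prompt injection to protect data from improper inference by LLMs	large language model	✅
25	Can MLLMs Understand the Deep Implication Behind Chinese Images?	Proposes the CII-Bench benchmark to evaluate multimodal LLMs' understanding of the deeper implications of Chinese images	large language model multimodal	✅
26	Retrospective Learning from Interactions	ReSpect: uses implicit feedback in interaction history to improve the reasoning of multimodal LLMs	large language model multimodal
27	Generating Signed Language Instructions in Large-Scale Dialogue Systems	Builds a signed-language instruction generation system on top of a large-scale dialogue system to improve multimodal interaction	large language model multimodal	✅
28	SimpleToM: Exposing the Gap between Explicit ToM Inference and Implicit ToM Application in LLMs	SimpleToM: exposes the gap between explicit theory-of-mind inference and its implicit application in LLMs	large language model chain-of-thought
29	GeoCoder: Solving Geometry Problems by Generating Modular Code through Vision-Language Models	GeoCoder: solves geometry problems by generating modular code with vision-language models	multimodal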
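30	Reference-Based Post-OCR Processing with LLM for Precise Diacritic Text in Historical Document Recognition	Proposes a post-OCR processing method based on LLMs and reference books to improve text recognition accuracy for historical documents	large language model
31	Detecting AI-Generated Texts in Cross-Domains	Proposes the RoBERTa-Ranker model to address the performance drop in cross-domain detection of AI-generated text	large language model
32	MIRAGE-Bench: Automatic Multilingual Benchmark Arena for Retrieval-Augmented Generation Systems	Proposes MIRAGE-Bench, a benchmark platform for automatically evaluating multilingual retrieval-augmented generation systems	large language model	✅
33	Retrieval of Temporal Event Sequences from Textual Descriptions	Proposes the TPP-Embedding model for retrieving temporal event sequences from textual descriptions, and builds the TESRBench benchmark	large language model
34	Measuring and Modifying the Readability of English Texts with GPT-4	Uses GPT-4 to measure and modify the readability of English texts, significantly outperforming traditional methods	large language model
35	LLMs are Biased Teachers: Evaluating LLM Bias in Personalized Education	Reveals the biases of LLMs acting as "teachers" in personalized education and proposes evaluation metrics	large language model
36	From Single to Multi: How LLMs Hallucinate in Multi-Document Summarization	Studies hallucination in multi-document summarization with the aim of improving LLM performance	large language model
37	BenTo: Benchmark Task Reduction with In-Context Transferability	BenTo: reduces the set of tasks in LLM evaluation benchmarks using in-context transferability	large language model
38	Modeling Future Conversation Turns to Teach LLMs to Ask Clarifying Questions	Improves LLMs' ability to ask clarifying questions by modeling future conversation turns	large language model
39	Unconstrained Model Merging for Enhanced LLM Reasoning	Proposes an unconstrained model merging framework to improve LLM performance on reasoning tasks	large language model
40	ORCHID: A Chinese Debate Corpus for Target-Independent Stance Detection and Argumentative Dialogue Summarization	Proposes ORCHID, a Chinese debate corpus for target-independent stance detection and debate dialogue summarization	large language model
41	A Comparative Study on Reasoning Patterns of OpenAI's o1 Model	Compares the reasoning patterns of OpenAI's o1 model, showing its strengths in math, code, and commonsense reasoning	large language model
42	Bias in the Mirror: Are LLMs opinions robust to their own adversarial attacks?	Proposes an LLM self-debate framework to evaluate how robust model biases are under adversarial attack	large language model
43	RAG-DDR: Optimizing Retrieval-Augmented Generation Using Differentiable Data Rewards	Proposes RAG-DDR, which optimizes retrieval-augmented generation with differentiable data rewards and improves knowledge utilization in small models	large language model	✅
44	IterSelectTune: An Iterative Training Framework for Efficient Instruction-Tuning Data Selection	IterSelectTune: an iterative training framework for efficient instruction-tuning data selection	large language model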
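45	From Citations to Criticality: Predicting Legal Decision Influence in the Multilingual Swiss Jurisprudence	Proposes the Criticality Prediction dataset for predicting the influence of Swiss legal decisions, to improve case prioritization	large language model
46	Judgment of Learning: A Human Ability Beyond Generative Artificial Intelligence	Reveals a metacognitive limitation of LLMs: they underperform humans on judgment-of-learning tasks	large language model
47	LAR-ECHR: A New Legal Argument Reasoning Task and Dataset for Cases of the European Court of Human Rights	Proposes the LAR-ECHR dataset to evaluate LLMs' legal reasoning on European Court of Human Rights cases	large language model
48	Cerberus: Efficient Inference with Adaptive Parallel Decoding and Sequential Knowledge Enhancement	Cerberus: efficient LLM inference through adaptive parallel decoding and sequential knowledge enhancement	large language model
49	Learning to Route LLMs with Confidence Tokens	Proposes Self-REF, which uses confidence tokens to improve LLM reliability and accuracy on downstream tasks	large language model
50	Breaking Chains: Unraveling the Links in Multi-Hop Knowledge Unlearning	Proposes MUNCH, an uncertainty-based multi-hop knowledge unlearning method that addresses the shortcomings of existing methods on indirect reasoning	large language model
51	Atomic Calibration of LLMs in Long-Form Generations	Proposes atomic calibration to evaluate fine-grained hallucination in LLM long-form generation	large language model
52	SPIN: Self-Supervised Prompt INjection	SPIN: self-supervised prompt injection for detecting and defending against adversarial attacks on LLMs	large language model
53	FaithBench: A Diverse Hallucination Benchmark for Summarization by Modern LLMs	FaithBench: a diverse benchmark of summarization hallucinations by modern LLMs	large language model	✅
54	The Geometry of Numerical Reasoning: Language Models Compare Numeric Properties in Linear Subspaces	Explores the geometry of numerical reasoning in LLMs: language models compare numeric properties in linear subspaces	large language model
55	SLM-Mod: Small Language Models Surpass LLMs at Content Moderation	SLM-Mod: small language models surpass LLMs at content moderation	large language model	✅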

🔬 Pillar 2: RL & Architecture (9 papers)

#	Title	One-sentence summary	Tags	🔗
56	Representation Learning of Structured Data for Medical Foundation Models	UniStruct: proposes a representation learning method for structured data in the medical domain to improve medical foundation models	representation learning large language model foundation model
57	CLaMP 2: Multimodal Music Information Retrieval Across 101 Languages Using Large Language Models	CLaMP 2: multimodal music information retrieval across 101 languages using LLMs	contrastive learning large language model multimodal
58	Boosting LLM Translation Skills without General Ability Loss via Rationale Distillation	Proposes RaDis, which boosts LLM translation ability through rationale distillation while avoiding loss of general ability	distillation large language model instruction following
59	An Active Learning Framework for Inclusive Generation by Large Language Models	Proposes a clustering-based active learning framework to improve LLM generation for diverse subgroups	distillation large language model
60	Web Agents with World Models: Learning and Leveraging Environment Dynamics in Web Navigation	Proposes WMA, a world-model-based web agent that improves decision-making in web navigation tasks	world model large language model
61	Enhancing Text Generation in Joint NLG/NLU Learning Through Curriculum Learning, Semi-Supervised Training, and Advanced Optimization Techniques	Proposes a method for enhancing text generation in joint NLG/NLU learning that combines curriculum learning, semi-supervised training, and optimization algorithms	reinforcement learning curriculum learning
62	PopAlign: Diversifying Contrasting Patterns for a More Comprehensive Alignment	PopAlign: more comprehensive LLM alignment through diversified contrasting patterns	RLHF large language model
63	Learning Metadata-Agnostic Representations for Text-to-SQL In-Context Example Selection	Proposes MARLO for in-context example selection in text-to-SQL, improving generalization	representation learning large language model
64	SeerAttention: Learning Intrinsic Sparse Attention in Your LLMs	SeerAttention: learns intrinsic sparse attention in LLMs to improve long-context processing efficiency	distillation large language model	✅
