cs.CL (2024-07-11)

📊 20 papers total | 🔗 1 with code

🔬 Pillar 9: Embodied Foundation Models (20 papers)

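#	Title	One-sentence summary	Tags	🔗
1	Beyond Instruction Following: Evaluating Inferential Rule Following of Large Language Models	RuleBench: evaluates the inferential rule-following ability of large language models and proposes IRFT to improve it	large language model, instruction following
2	Fault Diagnosis in Power Grids with Large Language Model	Proposes a prompt-engineering-based large language model method for power system fault diagnosis	large language model, chain-of-thought
3	Uncertainty Estimation of Large Language Models in Medical Question Answering	Proposes Two-phase Verification to improve uncertainty estimation of large language models in medical question answering	large language model
4	Are Large Language Models Really Bias-Free? Jailbreak Prompts for Assessing Adversarial Robustness to Bias Elicitation	Uses jailbreak prompts to assess the adversarial robustness of large language models to bias elicitation	large language model
5	Evaluating Nuanced Bias in Large Language Model Free Response Answers	Proposes a semi-automated pipeline for evaluating nuanced bias in free-response answers from large language models	large language model
6	A Taxonomy for Data Contamination in Large Language Models	Proposes a taxonomy of LLM data contamination and analyzes how contamination types affect downstream task performance	large language model
7	GTA: A Benchmark for General Tool Agents	Proposes the GTA benchmark to evaluate the tool-use ability of general tool agents in real-world scenarios	large language model, multimodal	✅
8	Speculative RAG: Enhancing Retrieval Augmented Generation through Drafting	Speculative RAG: enhances retrieval-augmented generation through drafting, improving accuracy and reducing latency	large language model
9	NinjaLLM: Fast, Scalable and Cost-effective RAG using Amazon SageMaker and AWS Trainium and Inferentia2	NinjaLLM: fast, scalable, and cost-effective RAG using Amazon SageMaker and AWS Trainium/Inferentia2	large language model
10	GPT-4 is judged more human than humans in displaced and inverted Turing tests	In displaced and inverted Turing tests, GPT-4 is judged to be human more often than actual humans are	large language model
11	Brief state of the art in social information mining: Practical application in analysis of trends in French legislative 2024	Uses social media mining techniques to analyze trends in the 2024 French legislative elections	large language model
12	Large Models of What? Mistaking Engineering Achievements for Human Linguistic Agency	Critical analysis: large language models are not a complete reproduction of human linguistic ability	large language model
13	Is Your Model Really A Good Math Reasoner? Evaluating Mathematical Reasoning with Checklist	Proposes MathCheck, a mathematical reasoning evaluation framework that improves the generalization and robustness of LLM math evaluation	large language model
14	Towards Building Specialized Generalist AI with System 1 and System 2 Fusion	Proposes a Specialized Generalist AI (SGAI) framework that fuses System 1 and System 2 as a step toward AGI	large language model
15	Turn-Level Empathy Prediction Using Psychological Indicators	Proposes a turn-level empathy prediction method based on decomposition into psychological indicators, improving empathy detection performance	large language model
16	On the Universal Truthfulness Hyperplane Inside LLMs	Explores a universal truthfulness hyperplane inside LLMs to address hallucination	large language model
17	Investigating Public Fine-Tuning Datasets: A Complex Review of Current Practices from a Construction Perspective	Surveys construction methods for public fine-tuning datasets to support large model training and development	large language model
18	Model Tells You Where to Merge: Adaptive KV Cache Merging for LLMs on Long-Context Tasks	Proposes KVMerger, which adaptively merges the KV cache to improve LLM performance on long-context tasks	large language model
19	RB-SQL: A Retrieval-based LLM Framework for Text-to-SQL	Proposes RB-SQL, a retrieval-based LLM framework for improving Text-to-SQL performance	large language model
20	Beyond Text: Leveraging Multi-Task Learning and Cognitive Appraisal Theory for Post-Purchase Intention Analysis	Uses multi-task learning and cognitive appraisal theory to analyze post-purchase intention	large language model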
