cs.CL (2025-05-08)

📊 24 papers in total | 🔗 5 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (19 🔗5) · Pillar 2: RL & Architecture (5)

🔬 Pillar 9: Embodied Foundation Models (19 papers)

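#	Title	Key Point	Tags	🔗	⭐
1	Do MLLMs Capture How Interfaces Guide User Behavior? A Benchmark for Multimodal UI/UX Design Understanding	Proposes the WiserUI-Bench benchmark to evaluate how well MLLMs understand the influence of UI/UX design on user behavior	large language model multimodal
2	Chain-of-Thought Tokens are Computer Program Variables	Shows that CoT tokens act like program variables when solving complex reasoning tasks	large language model chain-of-thought	✅
3	Toward Reasonable Parrots: Why Large Language Models Should Argue with Us by Design	Argues for designing LLMs as "reasonable parrots" with argumentation abilities that foster critical thinking	large language model
4	A Benchmark Dataset and a Framework for Urdu Multimodal Named Entity Recognition	Proposes the U-MNER framework and the Twitter2015-Urdu dataset to advance Urdu multimodal named entity recognition	multimodal
5	Unveiling Language-Specific Features in Large Language Models via Sparse Autoencoders	Uses sparse autoencoders to reveal language-specific features in large language models	large language model	✅
6	Performance Evaluation of Large Language Models in Bangla Consumer Health Query Summarization	Evaluates large language models on summarizing Bangla consumer health queries	large language model
7	Scalable Multi-Stage Influence Function for Large Language Models via Eigenvalue-Corrected Kronecker-Factored Parameterization	Proposes a scalable multi-stage influence function based on EK-FAC for analyzing how fine-tuned LLMs depend on pretraining data	large language model	✅
8	Bring Reason to Vision: Understanding Perception and Reasoning through Model Merging	Transfers the reasoning ability of large language models to vision-language models through model merging	large language model multimodal
9	Crosslingual Reasoning through Test-Time Scaling	Improves the crosslingual reasoning of English-centric language models through test-time scaling	large language model chain-of-thought
10	KG-HTC: Integrating Knowledge Graphs into LLMs for Effective Zero-shot Hierarchical Text Classification	Proposes KG-HTC, which combines knowledge graphs with LLMs for zero-shot hierarchical text classification	large language model	✅
11	ComPO: Preference Alignment via Comparison Oracles	Proposes ComPO, which aligns preferences via comparison oracles to address noisy preference pairs in LLMs	large language model
12	UKElectionNarratives: A Dataset of Misleading Narratives Surrounding Recent UK General Elections	Builds a dataset of misleading narratives around UK general elections and evaluates how well large language models detect them	large language model
13	clem:todd: A Framework for the Systematic Benchmarking of LLM-Based Task-Oriented Dialogue System Realisations	clem:todd: a framework for systematically benchmarking implementations of LLM-based task-oriented dialogue systems	large language model
14	Ultra-FineWeb: Efficient Data Filtering and Verification for High-Quality LLM Training Data	Ultra-FineWeb: efficient data filtering and verification to improve the quality of LLM training data	large language model
15	Frame In, Frame Out: Do LLMs Generate More Biased News Headlines than Humans?	Finds that large language models are more prone than humans to generating biased news headlines	large language model
16	RICo: Refined In-Context Contribution for Automatic Instruction-Tuning Data Selection	Proposes RICo, which uses in-context learning to improve instruction-tuning data selection and boost LLM performance	large language model
17	Product of Experts with LLMs: Boosting Performance on ARC Is a Matter of Perspective	Uses a product of experts with LLMs to boost ARC performance: perspective is the key	large language model
18	Reliably Bounding False Positives: A Zero-Shot Machine-Generated Text Detection Framework via Multiscaled Conformal Prediction	Proposes a zero-shot machine-generated text detection framework based on multiscaled conformal prediction that reliably bounds the false positive rate	large language model
19	Rethinking Invariance in In-context Learning	Proposes InvICL to address in-context learning's sensitivity to example order and the weak performance of existing invariant methods	large language model	✅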

🔬 Pillar 2: RL & Architecture (5 papers)

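#	Title	Key Point	Tags	🔗	⭐
20	Enhanced Urdu Intent Detection with Large Language Models and Prototype-Informed Predictive Pipelines	Proposes the LLMPIA framework to improve large language model performance on Urdu intent detection	representation learning contrastive learning large language model
21	Latent Preference Coding: Aligning Large Language Models via Discrete Latent Codes	Proposes Latent Preference Coding (LPC), which aligns large language models via discrete latent codes to better model human preferences	DPO large language model
22	Scalable LLM Math Reasoning Acceleration with Low-rank Distillation	Caprese: low-rank distillation that accelerates LLM math reasoning and substantially reduces compute cost	distillation large language model
23	Scaling Laws for Speculative Decoding	Proposes log-linear scaling laws for speculative decoding to accelerate LLM inference	RLHF large language model chain-of-thought
24	Reasoning Models Don't Always Say What They Think	Evaluates the faithfulness of chain-of-thought reasoning models, revealing inconsistencies between their stated reasoning and their actual behavior	reinforcement learning chain-of-thought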
