cs.CL (2025-07-01)

📊 22 papers in total | 🔗 7 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (16 🔗5) | Pillar 2: RL & Architecture (4 🔗2) | Pillar 6: Video Extraction (1) | Pillar 1: Robot Control (1)

🔬 Pillar 9: Embodied Foundation Models (16 papers)

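| # | Title | One-sentence summary | Tags | 🔗 |
|---|-------|----------------------|------|----|
| 1 | From Answers to Rationales: Self-Aligning Multimodal Reasoning with Answer-Oriented Chain-of-Thought | Proposes the SMART framework, which self-aligns multimodal reasoning via answer-oriented chain-of-thought to improve model generalization and robustness. | large language model, multimodal, chain-of-thought | |
| 2 | Mixture of Reasonings: Teach Large Language Models to Reason with Adaptive Strategies | Proposes the Mixture of Reasonings (MoR) framework to improve the adaptive reasoning ability of large language models on complex tasks. | large language model, chain-of-thought | |
| 3 | SciArena: An Open Evaluation Platform for Foundation Models in Scientific Literature Tasks | Proposes SciArena to address shortcomings in the evaluation of scientific literature tasks. | foundation model | |
| 4 | La Leaderboard: A Large Language Model Leaderboard for Spanish Varieties and Languages of Spain and Latin America | Proposes La Leaderboard for evaluating LLM performance on Spanish and its varieties. | large language model | |
| 5 | TransLaw: Benchmarking Large Language Models in Multi-Agent Simulation of the Collaborative Translation | TransLaw: a multi-agent collaborative translation framework for benchmarking LLMs on Hong Kong legal judgments. | large language model | |
| 6 | AI Analyst: Framework and Comprehensive Evaluation of Large Language Models for Financial Time Series Report Generation | AI Analyst: proposes a framework and a comprehensive evaluation method for generating financial time series reports with large language models. | large language model | |
| 7 | Leveraging Large Language Models for Spontaneous Speech-Based Suicide Risk Detection | Uses large language models to detect suicide risk from spontaneous speech. | large language model | |
| 8 | GAF-Guard: An Agentic Framework for Risk Management and Governance in Large Language Models | GAF-Guard: an agentic framework for risk management and governance of large language models. | large language model | |
| 9 | MassTool: A Multi-Task Search-Based Tool Retrieval Framework for Large Language Models | MassTool: a multi-task, search-based tool retrieval framework for large language models. | large language model | |
| 10 | 'For Argument's Sake, Show Me How to Harm Myself!': Jailbreaking LLMs in Suicide and Self-Harm Contexts | Proposes a multi-step prompt-based adversarial attack for suicide and self-harm contexts that successfully bypasses LLM safety guardrails. | large language model | |
| 11 | Mathematics Isn't Culture-Free: Probing Cultural Gaps via Entity and Scenario Perturbations | Probes how cultural differences affect math problem solving via entity and scenario perturbations. | large language model | |
| 12 | Stylometry recognizes human and LLM-generated texts in short samples | Shows that stylometry can effectively distinguish human-written from LLM-generated short texts, addressing model attribution and AI ethics concerns. | large language model | |
| 13 | A Comparative Study of Competency Question Elicitation Methods from Ontology Requirements | Compares methods for eliciting competency questions (CQs) from ontology requirements, revealing the strengths and weaknesses of LLM-generated CQs. | large language model | |
| 14 | Many LLMs Are More Utilitarian Than One | Shows that multi-agent LLM systems lean more utilitarian in moral judgments than single agents. | large language model | |
| 15 | LitBench: A Benchmark and Dataset for Reliable Evaluation of Creative Writing | LitBench: a benchmark and dataset for reliable evaluation of creative writing. | large language model | |
| 16 | Transferable Modeling Strategies for Low-Resource LLM Tasks: A Prompt and Alignment-Based Approach | Proposes a prompt- and alignment-based transfer learning strategy for low-resource LLM tasks. | large language model | |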

🔬 Pillar 2: RL & Architecture (4 papers)

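| # | Title | One-sentence summary | Tags | 🔗 |
|---|-------|----------------------|------|----|
| 17 | We Need Knowledge Distillation for Solving Math Word Problems | For math word problems, proposes compressing LLMs via knowledge distillation to lower the cost of intelligent education. | distillation, large language model | |
| 18 | SAFER: Probing Safety in Reward Models with Sparse Autoencoder | SAFER: uses sparse autoencoders to probe safety in reward models. | reinforcement learning, RLHF, large language model | |
| 19 | TeamCMU at Touché: Adversarial Co-Evolution for Advertisement Integration and Detection in Conversational Search | Proposes an ad management module to address advertisement integration and detection in conversational search. | curriculum learning, large language model | |
| 20 | Causal Prompting for Implicit Sentiment Analysis with Large Language Models | Proposes the CAPITAL framework to address causal reasoning in implicit sentiment analysis. | contrastive learning, large language model, chain-of-thought | |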

🔬 Pillar 6: Video Extraction (1 paper)

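| # | Title | One-sentence summary | Tags | 🔗 |
|---|-------|----------------------|------|----|
| 21 | MemeCMD: An Automatically Generated Chinese Multi-turn Dialogue Dataset with Contextually Retrieved Memes | MemeCMD: presents an automatically generated Chinese multi-turn dialogue dataset with contextually retrieved memes. | HuMoR, multimodal | |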

🔬 Pillar 1: Robot Control (1 paper)

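| # | Title | One-sentence summary | Tags | 🔗 |
|---|-------|----------------------|------|----|
| 22 | Pitfalls of Evaluating Language Models with Open Benchmarks | Reveals data leakage risks for language models in open benchmarks and proposes mitigation strategies. | manipulation, large language model | |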
