cs.AI (2025-09-27)

📊 28 papers in total | 🔗 4 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (23 🔗3) · Pillar 2: RL & Architecture (5 🔗1)

🔬 Pillar 9: Embodied Foundation Models (23 papers)

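# | Title | One-sentence summary | Tags | 🔗
1 | ABC-Eval: Benchmarking Large Language Models on Symbolic Music Understanding and Instruction Following | Proposes the ABC-Eval benchmark to evaluate large language models on symbolic music understanding and instruction following. | large language model, instruction following
2 | Training Vision-Language Process Reward Models for Test-Time Scaling in Multimodal Reasoning: Key Insights and Lessons Learned | Proposes a hybrid data synthesis framework and perception-focused supervision to improve multimodal reasoning in vision-language models. | large language model, multimodal, visual grounding
3 | AudioRole: An Audio Dataset for Character Role-Playing in Large Language Models | Proposes the AudioRole dataset to improve large language model performance in spoken character role-playing. | large language model, multimodal
4 | Transferring Vision-Language-Action Models to Industry Applications: Architectures, Performance, and Challenges | Evaluates and improves the performance of vision-language-action models in industrial scenarios. | vision-language-action, VLA
5 | Measuring Physical-World Privacy Awareness of Large Language Models: An Evaluation Benchmark | Proposes the EAPrivacy benchmark to evaluate the privacy awareness of embodied agents in the physical world. | large language model
6 | Fact Grounded Attention: Eliminating Hallucination in Large Language Models Through Attention Level Knowledge Integration | Proposes Fact Grounded Attention, which injects knowledge into the attention mechanism to eliminate factual hallucination in large language models. | large language model
7 | Artificial Phantasia: Evidence for Propositional Reasoning-Based Mental Imagery in Large Language Models | Proposes a mental imagery task based on propositional reasoning to evaluate the complex cognitive abilities of large language models. | large language model
8 | CATMark: A Context-Aware Thresholding Framework for Robust Cross-Task Watermarking in Large Language Models | Proposes CATMark, a context-aware thresholding watermarking framework that improves the robustness and text quality of cross-task watermarking in large language models. | large language model
9 | GUI-PRA: Process Reward Agent for GUI Tasks | GUI-PRA: a process reward agent for GUI tasks that addresses the "lost in the middle" problem and UI state awareness in long-horizon tasks. | large language model, multimodal
10 | Agentic AI Reasoning for Mobile Edge General Intelligence: Fundamentals, Approaches, and Directions | Proposes an Agentic AI-based reasoning framework for mobile edge general intelligence that optimizes LLM deployment in resource-constrained environments. | large language model, chain-of-thought
11 | VeriGRAG: Enhancing LLM-Based Verilog Code Generation with Structure-Aware Soft Prompts | VeriGRAG: enhances LLM-based Verilog code generation with structure-aware soft prompts. | large language model, multimodal
12 | Your Dense Retriever is Secretly an Expeditious Reasoner | Proposes AdaQR, an adaptive hybrid query rewriting framework that improves the efficiency of reasoning-based retrieval. | large language model
13 | PARROT: A Benchmark for Evaluating LLMs in Cross-System SQL Translation | PARROT: a benchmark for evaluating LLMs on cross-system SQL translation. | large language model
14 | Understanding and Enhancing the Planning Capability of Language Models via Multi-Token Prediction | Uses multi-token prediction to strengthen language models' ability to learn transitive relations in complex planning. | large language model
15 | MathBode: Measuring the Stability of LLM Reasoning using Frequency Response | MathBode: measures the stability of LLM reasoning using frequency response. | large language model
16 | ReliabilityRAG: Effective and Provably Robust Defense for RAG-based Web-Search | Proposes ReliabilityRAG, which uses document reliability information to make RAG more robust in web search and to defend against prompt injection attacks. | large language model
17 | Model Consistency as a Cheap yet Predictive Proxy for LLM Elo Scores | Proposes model consistency as a proxy for LLM Elo scores that needs no human evaluation and predicts model performance efficiently. | large language model
18 | GeoBS: Information-Theoretic Quantification of Geographic Bias in AI Models | Proposes GeoBS, an information-theoretic framework for quantifying geographic bias in AI models. | foundation model
19 | NeuroBridge: Using Generative AI to Bridge Cross-neurotype Communication Differences through Neurotypical Perspective-taking | NeuroBridge: uses generative AI and neurotypical perspective-taking to bridge cross-neurotype communication differences. | large language model
20 | Scaling LLM Test-Time Compute with Mobile NPU on Smartphones | Proposes parallel test-time scaling techniques for LLMs on mobile NPUs, improving small-model performance. | large language model
21 | p-less Sampling: A Robust Hyperparameter-Free Approach for LLM Decoding | Proposes p-less sampling, a hyperparameter-free LLM decoding method that improves generation quality and efficiency. | large language model
22 | AutoEP: LLMs-Driven Automation of Hyperparameter Evolution for Metaheuristic Algorithms | AutoEP: uses LLM-driven hyperparameter evolution to automatically optimize metaheuristic algorithms. | large language model
23 | Kimi-Dev: Agentless Training as Skill Prior for SWE-Agents | Kimi-Dev: uses Agentless training as a skill prior to improve the performance of software engineering agents. | large language model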

🔬 Pillar 2: RL & Architecture (5 papers)

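# | Title | One-sentence summary | Tags | 🔗
24 | Toward Effective Tool-Integrated Reasoning via Self-Evolved Preference Learning | Proposes the Tool-Light framework, which uses self-evolved preference learning to improve the efficiency and accuracy of LLM tool-integrated reasoning. | preference learning, DPO, direct preference optimization
25 | Multiplayer Nash Preference Optimization | Proposes MNPO, which extends NLHF to multiplayer games to improve alignment with complex preferences. | reinforcement learning, RLHF, large language model
26 | Mapping Overlaps in Benchmarks through Perplexity in the Wild | Uses perplexity analysis to reveal overlaps and differences among large language model evaluation benchmarks. | world model, large language model, instruction following
27 | Risk Profiling and Modulation for LLMs | Proposes a risk profiling and modulation pipeline for LLMs and explores how post-training affects risk preferences. | RLHF, large language model
28 | Coordination Requires Simplification: Thermodynamic Bounds on Multi-Objective Compromise in Natural and Artificial Intelligence | Proposes a thermodynamic theory of coordination, showing that multi-objective coordination requires simplifying information to cope with thermodynamic constraints. | reinforcement learning, large language model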
