| 1 | MoHoBench: Assessing Honesty of Multimodal Large Language Models via Unanswerable Visual Questions | MoHoBench: assesses the honesty of multimodal large language models using unanswerable visual questions | preference learning large language model multimodal | ✅ | |
| 2 | Secure Tug-of-War (SecTOW): Iterative Defense-Attack Training with Reinforcement Learning for Multimodal Model Security | Proposes SecTOW, which improves the security of multimodal large models through iterative attack-defense training with reinforcement learning. | reinforcement learning large language model multimodal | | |
| 3 | UI-AGILE: Advancing GUI Agents with Effective Reinforcement Learning and Precise Inference-Time Grounding | UI-AGILE: improves GUI agent performance through reinforcement learning and precise inference-time grounding | reinforcement learning large language model multimodal | ✅ | |
| 4 | Large Language Model-Based Framework for Explainable Cyberattack Detection in Automatic Generation Control Systems | Proposes a large language model-based framework for explainable cyberattack detection in automatic generation control systems | MAE large language model | | |
| 5 | ChemDFM-R: A Chemical Reasoning LLM Enhanced with Atomized Chemical Knowledge | ChemDFM-R: a chemical reasoning large language model enhanced with atomized chemical knowledge | reinforcement learning distillation large language model | | |
| 6 | Assistax: A Hardware-Accelerated Reinforcement Learning Benchmark for Assistive Robotics | Assistax: a hardware-accelerated reinforcement learning benchmark platform for assistive robotics | reinforcement learning | ✅ | |
| 7 | CoEx -- Co-evolving World-model and Exploration | CoEx: addresses knowledge bias in LLM agent planning through a co-evolving world model and exploration | world model | | |
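| 8 | Multi-modal Relational Item Representation Learning for Inferring Substitutable and Complementary Items | Proposes the MMSC framework, which uses multimodal relational learning to infer substitutable and complementary items, addressing noisy user behavior and data sparsity. | representation learning | | |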
| 9 | Reasoning Language Models for Root Cause Analysis in 5G Wireless Networks | Proposes a domain-knowledge-enhanced LLM framework for root cause analysis in 5G wireless networks | reinforcement learning large language model | | |
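| 10 | EDGE-GRPO: Entropy-Driven GRPO with Guided Error Correction for Advantage Diversity | Proposes the EDGE-GRPO algorithm, which addresses advantage collapse in GRPO through an entropy-driven advantage function and guided error correction | reinforcement learning large language model | ✅ | |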
| 11 | What Does it Mean for a Neural Network to Learn a "World Model"? | Proposes operational evaluation criteria for a neural network learning a "world model" | world model | | |
| 12 | Exploring the Stratified Space Structure of an RL Game with the Volume Growth Transform | Uses the volume growth transform to explore the stratified space structure of a Transformer model in a reinforcement learning game | reinforcement learning PPO | | |