cs.AI (2025-04-21)
📊 30 papers in total | 🔗 4 with code
🎯 Interest area navigation
🔬 Pillar 9: Embodied Foundation Models (17 papers)
🔬 Pillar 2: RL Algorithms & Architecture (13 papers)
| # | Title | One-line summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 18 | KGMEL: Knowledge Graph-Enhanced Multimodal Entity Linking | KGMEL: proposes a knowledge graph-enhanced multimodal entity linking framework that improves entity alignment accuracy. | contrastive learning, large language model, multimodal | ✅ | |
| 19 | Establishing Reliability Metrics for Reward Models in Large Language Models | Proposes the RETA metric for quantitatively evaluating the reliability of reward models for large language models. | reinforcement learning, RLHF, large language model | | |
| 20 | DRAGON: Distributional Rewards Optimize Diffusion Generative Models | Proposes the DRAGON framework, which optimizes diffusion generative models using distributional rewards. | reinforcement learning, RLHF, DPO | ✅ | |
| 21 | Integrating Symbolic Execution into the Fine-Tuning of Code-Generating LLMs | Uses symbolic execution to strengthen the reward model, improving the fine-tuning of code-generating LLMs. | reinforcement learning, direct preference optimization, large language model | | |
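| 22 | Text-to-Decision Agent: Offline Meta-Reinforcement Learning from Natural Language Supervision | Proposes T2DA, which supervises offline meta-reinforcement learning with natural language to achieve zero-shot text-to-decision generalization. | reinforcement learning, world model | ✅ | |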
| 23 | Acting Less is Reasoning More! Teaching Model to Act Efficiently | Proposes OTC-PO, which improves LLM efficiency in tool-integrated reasoning by reducing redundant tool calls. | reinforcement learning, PPO, large language model | | |
| 24 | Stop Summation: Min-Form Credit Assignment Is All Process Reward Model Needs for Reasoning | Proposes PURE, which uses min-form credit assignment to address reward hacking in process reward models. | reinforcement learning, large language model | ✅ | |
| 25 | Mitigating Degree Bias in Graph Representation Learning with Learnable Structural Augmentation and Structural Self-Attention | DegFairGT: mitigates degree bias in graph representation learning through learnable structural augmentation and structural self-attention. | representation learning | | |
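| 26 | A Self-supervised Learning Method for Raman Spectroscopy based on Masked Autoencoders | Proposes a self-supervised learning method for Raman spectroscopy based on masked autoencoders, improving spectral analysis performance. | masked autoencoder | | |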
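| 27 | Learning Adaptive Parallel Reasoning with Language Models | Proposes the Adaptive Parallel Reasoning (APR) framework, improving the performance and efficiency of language models on complex reasoning tasks. | reinforcement learning, chain-of-thought | | |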
| 28 | Contemplative Artificial Intelligence | Proposes "Contemplative Artificial Intelligence", which uses introspective principles to improve AI safety and cooperativeness. | world model, chain-of-thought | | |
| 29 | aiXamine: Simplified LLM Safety and Security | aiXamine: a comprehensive black-box evaluation platform that simplifies LLM safety and security assessment. | distillation, large language model | | |
| 30 | EducationQ: Evaluating LLMs' Teaching Capabilities Through Multi-Agent Dialogue Framework | EducationQ: evaluates the teaching capabilities of LLMs through a multi-agent dialogue framework. | teacher-student, large language model | | |