| 22 |
Spatial Representation Learning Beyond Pixels: Unifying Raster Data and Vector Semantics for Human-Centric Geospatial Foundation Models |
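Proposes a unified spatial representation learning framework that fuses raster data with vector semantics to build human-centric geospatial foundation models. |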
representation learning foundation model multimodal |
|
|
| 23 |
TrafficRAG: A Multimodal RAG Framework for Traffic Accident Liability Determination |
TrafficRAG: a multimodal retrieval-augmented framework for determining liability in traffic accidents |
MAE large language model multimodal |
|
|
| 24 |
Explainable Data-driven Deep Reinforcement Learning Methods for Optimal Energy Management in Buildings |
Proposes an explainable deep reinforcement learning framework that optimizes building energy management and improves user trust |
reinforcement learning deep reinforcement learning DRL |
|
|
| 25 |
COMAP: Co-Evolving World Models and Agent Policies for LLM Agents |
COMAP: co-evolving world models and policies for LLM agents, improving decision-making in interactive environments |
world model world models distillation |
✅ |
|
| 26 |
Echo: A Joint-Embedding Predictive Architecture for Speaker Diarization and Speech Recognition in a Shared Latent Space |
Echo: a joint-embedding predictive architecture built on a shared latent space for speaker diarization and speech recognition |
JEPA Joint-Embedding Predictive Architecture joint-embedding predictive architecture |
|
|
| 27 |
EvoBrain: Continual Learning of EEG Foundation Models Across Heterogeneous BCI Tasks |
EvoBrain: a continual learning framework for EEG foundation models across heterogeneous BCI tasks |
distillation foundation model |
|
|
| 28 |
SafeSteer: Localized On-Policy Distillation for Efficient Safety Alignment |
SafeSteer: a localized on-policy distillation method for safety alignment |
distillation large language model |
✅ |
|
| 29 |
Learning When Not to Act: Mitigating Tool Abuse in Agentic Reinforcement Learning |
EAPO: mitigates tool abuse in agentic reinforcement learning by learning when not to act |
reinforcement learning policy learning reward shaping |
|
|
| 30 |
SafeMCP: Proactive Power Regulation for LLM Agent Defense via Environment-Grounded Look-Ahead Reasoning |
SafeMCP: proactive capability regulation for LLM agents via environment-aware look-ahead reasoning |
reinforcement learning world model world models |
|
|
| 31 |
Community-Aware Assessment of Social Textual Engagement and Resonance: A Human-Centric Perspective on User-Generated Content Evaluation |
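Proposes the MEDEA model, which evaluates the quality of user-generated content by simulating community resonance, going beyond traditional visual fidelity metrics. |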
reinforcement learning multimodal chain-of-thought |
|
|
| 32 |
SIRI: Self-Internalizing Reinforcement Learning with Intrinsic Skills for LLM Agent Training |
SIRI: trains LLM agents through self-internalizing reinforcement learning with intrinsic skills |
reinforcement learning distillation |
✅ |
|
| 33 |
S-SPPO: Semantic-Calibrated Self-Play Preference Optimization |
S-SPPO: semantic-calibrated self-play preference optimization that addresses policy degradation in LLM alignment |
DPO direct preference optimization large language model |
✅ |
|
| 34 |
Coordination Graphs for Constrained Multi-Agent Reinforcement Learning |
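Proposes the CG-CMARL framework, which solves constrained multi-agent reinforcement learning problems using coordination graphs and Lagrangian duality |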
reinforcement learning reward shaping |
|
|
| 35 |
Physically-Constrained Mamba-SDE for Remaining Useful Life Prediction under Irregular Observations |
Proposes PC-MambaSDE, which addresses the physical-constraint problem in remaining useful life prediction under irregular observations |
latent dynamics Mamba |
|
|
| 36 |
JenBridge: Adaptive Long-Form Video Soundtracking across Scene Transitions |
Proposes JenBridge, which addresses soundtrack coherence across scene transitions in long-form video |
flow matching large language model |
|
|
| 37 |
Harness-1: Reinforcement Learning for Search Agents with State-Externalizing Harnesses |
Harness-1: improves search agent performance using reinforcement learning and external state management |
reinforcement learning |
✅ |
|
| 38 |
EVA-Net: Subject-Independent EEG Motor Decoding with Video-Derived Motor Priors |
EVA-Net: subject-independent EEG motor decoding using video-derived motor priors |
distillation multimodal |
|
|
| 39 |
TriAlign: Towards Universal Truth Consistency in Personalized LLM Alignment |
TriAlign: a universal truth-consistency method for personalized LLM alignment |
reinforcement learning large language model |
|
|
| 40 |
TRON: Targeted Rule-Verifiable Online Environments for Visual Reasoning RL |
TRON: controllable, rule-verifiable online environments for visual reasoning reinforcement learning |
reinforcement learning multimodal |
|
|
| 41 |
ReSkill: Reconciling Skill Creation with Policy Optimization in Agentic RL |
ReSkill: reconciling skill creation with policy optimization in agentic RL |
reinforcement learning policy learning |
|
|