| 1 |
Robust Multimodal Representation Learning in Healthcare |
Proposes a dual-stream feature disentanglement framework to address bias in medical multimodal representation learning |
representation learning multimodal |
|
|
| 2 |
Factored Causal Representation Learning for Robust Reward Modeling in RLHF |
Proposes factored causal representation learning to improve the robustness of reward models in RLHF |
reinforcement learning RLHF representation learning |
|
|
| 3 |
Heterogeneous Vertiport Selection Optimization for On-Demand Air Taxi Services: A Deep Reinforcement Learning Approach |
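Proposes a deep-reinforcement-learning-based method for heterogeneous vertiport selection optimization, improving the efficiency of on-demand air taxi services |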
reinforcement learning deep reinforcement learning multimodal |
✅ |
|
| 4 |
Visual Disentangled Diffusion Autoencoders: Scalable Counterfactual Generation for Foundation Models |
Proposes visual disentangled diffusion autoencoders to address counterfactual generation for foundation models |
distillation foundation model |
|
|
| 5 |
Rethinking Federated Graph Foundation Models: A Graph-Language Alignment-based Approach |
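Proposes the FedGALA framework, which uses graph-language alignment to address knowledge loss and heterogeneity in federated graph foundation models |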
contrastive learning foundation model |
|
|
| 6 |
NetMamba+: A Framework of Pre-trained Models for Efficient and Accurate Network Traffic Classification |
NetMamba+: a framework of pre-trained models for efficient and accurate network traffic classification |
Mamba multimodal |
|
|
| 7 |
Expected Return Causes Outcome-Level Mode Collapse in Reinforcement Learning and How to Fix It with Inverse Probability Scaling |
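Proposes a GRPO algorithm with inverse probability scaling to address the mode collapse caused by expected return in reinforcement learning |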
reinforcement learning multimodal |
|
|
| 8 |
Mitigating Overthinking in Large Reasoning Models via Difficulty-aware Reinforcement Learning |
Proposes DiPO, a difficulty-aware reinforcement learning method, to mitigate overthinking in large reasoning models |
reinforcement learning chain-of-thought |
|
|
| 9 |
READY: Reward Discovery for Meta-Black-Box Optimization |
READY: a reward-discovery-based method for meta-black-box optimization that uses LLMs to design reward functions automatically |
reinforcement learning reward design large language model |
|
|
| 10 |
When does predictive inverse dynamics outperform behavior cloning? |
Proposes predictive inverse dynamics models, which achieve a better bias-variance trade-off in imitation learning |
imitation learning behavior cloning |
|
|
| 11 |
The Surprising Difficulty of Search in Model-Based Reinforcement Learning |
Search is not a cure-all in model predictive control: mitigating distribution shift matters more than improving model accuracy |
reinforcement learning model-based RL |
|
|
| 12 |
Prior-Informed Flow Matching for Graph Reconstruction |
Proposes Prior-Informed Flow Matching (PIFM) for graph reconstruction, improving reconstruction accuracy |
flow matching |
|
|
| 13 |
Negatives-Dominant Contrastive Learning for Generalization in Imbalanced Domains |
Proposes a negatives-dominant contrastive learning method to address imbalanced domain generalization |
contrastive learning |
✅ |
|
| 14 |
Constrained Meta Reinforcement Learning with Provable Test-Time Safety |
Proposes a constrained meta reinforcement learning algorithm with provable test-time safety |
reinforcement learning |
|
|
| 15 |
Curriculum Learning for LLM Pretraining: An Analysis of Learning Dynamics |
Curriculum learning improves the stability of LLM pretraining, optimizing the model by controlling gradient variance |
curriculum learning |
|
|
| 16 |
Epistemic Uncertainty Quantification for Pre-trained VLMs via Riemannian Flow Matching |
Proposes REPVLM, which quantifies the epistemic uncertainty of pre-trained VLMs via Riemannian flow matching |
flow matching |
|
|
| 17 |
Generative Design of Ship Propellers using Conditional Flow Matching |
Generative design of ship propellers using conditional flow matching |
flow matching |
|
|
| 18 |
Reinforcement Learning for Adaptive Composition of Quantum Circuit Optimisation Passes |
Proposes a reinforcement-learning-based method for adaptively composing quantum circuit optimization passes |
reinforcement learning |
|
|
| 19 |
Scalable Power Sampling: Unlocking Efficient, Training-Free Reasoning for LLMs via Distribution Sharpening |
Proposes a scalable Power Sampling method that achieves efficient, training-free reasoning for LLMs through distribution sharpening |
reinforcement learning large language model |
|
|
| 20 |
Explicit Credit Assignment through Local Rewards and Dependence Graphs in Multi-Agent Reinforcement Learning |
Proposes a MARL method based on local rewards and dependence graphs that explicitly addresses multi-agent credit assignment |
reinforcement learning |
|
|
| 21 |
HER: Human-like Reasoning and Reinforcement Learning for LLM Role-playing |
Proposes the HER framework to address cognitive simulation in LLM role-playing |
reinforcement learning |
|
|
| 22 |
Grounding and Enhancing Informativeness and Utility in Dataset Distillation |
Proposes the InfoUtil framework, which improves dataset distillation performance by maximizing informativeness and utility |
distillation |
|
|
| 23 |
Physics-Guided Tiny-Mamba Transformer for Reliability-Aware Early Fault Warning |
Proposes a physics-guided Tiny-Mamba Transformer for early fault warning in rotating machinery |
Mamba |
|
|
| 24 |
Less Noise, More Voice: Reinforcement Learning for Reasoning via Instruction Purification |
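LENS: reinforcement learning for reasoning via instruction purification, improving LLM exploration efficiency and training stability on complex tasks |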
reinforcement learning |
|
|
| 25 |
Signal-Adaptive Trust Regions for Gradient-Free Optimization of Recurrent Spiking Neural Networks |
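Proposes signal-adaptive trust regions (SATR) for optimizing RSNNs, improving performance on high-dimensional reinforcement learning control |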
reinforcement learning PPO |
|
|