| 1 |
MMBind: Unleashing the Potential of Distributed and Heterogeneous Data for Multimodal Learning in IoT |
MMBind: leveraging distributed, heterogeneous data for multimodal learning in IoT |
contrastive learning foundation model multimodal |
✅ |
|
| 2 |
Preserving Expert-Level Privacy in Offline Reinforcement Learning |
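Proposes a consensus-based offline reinforcement learning method with expert-level differential privacy, protecting the privacy of experts. |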
reinforcement learning offline RL offline reinforcement learning |
|
|
| 3 |
METEOR: Evolutionary Journey of Large Language Models from Guidance to Self-Growth |
Proposes METEOR, a method that takes large language models from guided learning to autonomous evolution |
distillation large language model |
|
|
| 4 |
Dissecting Representation Misalignment in Contrastive Learning via Influence Function |
Proposes ECIF, which extends influence functions to address representation misalignment in contrastive learning |
contrastive learning multimodal |
|
|
| 5 |
EXCON: Extreme Instance-based Contrastive Representation Learning of Severely Imbalanced Multivariate Time Series for Solar Flare Prediction |
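EXCON: an extreme-instance-based contrastive learning method for solar flare prediction that addresses severely imbalanced multivariate time series |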
predictive model representation learning contrastive learning |
|
|
| 6 |
Mapping out the Space of Human Feedback for Reinforcement Learning: A Conceptual Framework |
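Maps out the space of human feedback for reinforcement learning, proposing a conceptual framework that unifies feedback types and quality assessment. |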
reinforcement learning RLHF |
|
|
| 7 |
Theoretical Corrections and the Leveraging of Reinforcement Learning to Enhance Triangle Attack |
Proposes TARL, a reinforcement-learning-based Triangle Attack that improves the efficiency of black-box adversarial attacks. |
reinforcement learning |
|
|
| 8 |
Robust Reinforcement Learning under Diffusion Models for Data with Jumps |
Proposes the MSBVE algorithm, which improves the robustness and convergence of reinforcement learning under jump-diffusion models |
reinforcement learning |
|
|
| 9 |
Value Imprint: A Technique for Auditing the Human Values Embedded in RLHF Datasets |
Value Imprint: a technique for auditing the human values embedded in RLHF datasets |
RLHF |
|
|
| 10 |
Near-Optimal Reinforcement Learning with Shuffle Differential Privacy |
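Proposes the SDP-PE algorithm, which achieves near-optimal reinforcement learning under shuffle differential privacy and addresses privacy leakage in networked systems. |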
reinforcement learning |
|
|
| 11 |
Structure learning with Temporal Gaussian Mixture for model-based Reinforcement Learning |
Proposes a structure learning method based on a temporal Gaussian mixture model for model-based reinforcement learning. |
reinforcement learning |
|
|
| 12 |
Continual Task Learning through Adaptive Policy Self-Composition |
Proposes CompoFormer, which uses adaptive policy composition to address catastrophic forgetting in offline continual reinforcement learning |
reinforcement learning offline RL offline reinforcement learning |
|
|
| 13 |
Aligning Few-Step Diffusion Models with Dense Reward Difference Learning |
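Proposes SDPO, which aligns few-step diffusion models through dense reward difference learning and improves generalization across step counts |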
reinforcement learning diffusion policy |
✅ |
|
| 14 |
Reinforced Symbolic Learning with Logical Constraints for Predicting Turbine Blade Fatigue Life |
Proposes RSL, a reinforcement-learning-based symbolic learning method for predicting turbine blade fatigue life |
reinforcement learning deep reinforcement learning |
|
|