| 1 |
Promoting cross-modal representations to improve multimodal foundation models for physiological signals |
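Proposes a multimodal pretraining model for physiological signals based on enhanced cross-modal representations, improving performance in healthcare applications. |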
contrastive learning foundation model multimodal |
|
|
| 2 |
Pruning Foundation Models for High Accuracy without Retraining |
Proposes APT, a training-free pruning algorithm that compresses and accelerates large models while retaining high accuracy. |
Mamba large language model foundation model |
✅ |
|
| 3 |
Offline reinforcement learning for job-shop scheduling problems |
Proposes an offline reinforcement learning method for solving job-shop scheduling problems. |
reinforcement learning deep reinforcement learning offline RL |
|
|
| 4 |
On The Global Convergence Of Online RLHF With Neural Parametrization |
Proposes a bilevel optimization framework to address distribution shift in RLHF. |
reinforcement learning policy learning RLHF |
|
|
| 5 |
Understanding and Alleviating Memory Consumption in RLHF for LLMs |
Proposes memory optimization methods for RLHF fine-tuning of LLMs to reduce resource consumption. |
reinforcement learning RLHF large language model |
|
|
| 6 |
In-Trajectory Inverse Reinforcement Learning: Learn Incrementally Before An Ongoing Trajectory Terminates |
Proposes in-trajectory inverse reinforcement learning to address the incremental learning problem. |
reinforcement learning inverse reinforcement learning |
|
|
| 7 |
Solving Continual Offline RL through Selective Weights Activation on Aligned Spaces |
Proposes VQ-CD, which solves continual offline reinforcement learning through selective weight activation on aligned spaces. |
reinforcement learning offline RL offline reinforcement learning |
|
|
| 8 |
Do Audio-Language Models Understand Linguistic Variations? |
Proposes RobustCLAP, which improves the generalization of audio-language models to linguistic variations in text queries. |
contrastive learning open-vocabulary open vocabulary |
|
|
| 9 |
A Plug-and-Play Fully On-the-Job Real-Time Reinforcement Learning Algorithm for a Direct-Drive Tandem-Wing Experiment Platforms Under Multiple Random Operating Conditions |
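Proposes CRL2E, a plug-and-play, fully on-the-job real-time reinforcement learning algorithm for tandem-wing aircraft, addressing motion control under multiple random operating conditions. |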
reinforcement learning |
|
|
| 10 |
RGMDT: Return-Gap-Minimizing Decision Tree Extraction in Non-Euclidean Metric Space |
RGMDT: a return-gap-minimizing decision tree extraction method in non-Euclidean metric space. |
reinforcement learning deep reinforcement learning DRL |
|
|
| 11 |
Modeling Structured Data Learning with Restricted Boltzmann Machines in the Teacher-Student Setting |
Studies the ability of restricted Boltzmann machines to learn structured data in the teacher-student setting. |
teacher-student |
|
|
| 12 |
Information-Theoretic Minimax Regret Bounds for Reinforcement Learning based on Duality |
Information-theoretic minimax regret bounds for reinforcement learning based on duality. |
reinforcement learning |
|
|
| 13 |
Diverse Policies Recovering via Pointwise Mutual Information Weighted Imitation Learning |
Proposes a pointwise mutual information weighted imitation learning method for recovering diverse policies. |
imitation learning |
|
|
| 14 |
Model Mimic Attack: Knowledge Distillation for Provably Transferable Adversarial Examples |
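Proposes a knowledge-distillation-based model mimic attack that improves the transferability of black-box adversarial examples, with theoretical guarantees. |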
distillation |
|
|