cs.LG (2025-06-27)

📊 23 papers in total | 🔗 5 with code

🎯 Interest Area Navigation

Pillar 2: RL Algorithms & Architecture (14 🔗3) · Pillar 9: Embodied Foundation Models (6 🔗2) · Pillar 1: Robot Control (2) · Pillar 8: Physics-based Animation (1)

🔬 Pillar 2: RL Algorithms & Architecture (14 papers)

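| # | Title | Key point | Tags | 🔗 |
|---|---|---|---|---|
| 1 | The Hidden Link Between RLHF and Contrastive Learning | Proposes a mutual-information optimization method to improve reinforcement learning from human feedback | reinforcement learning, RLHF, DPO | |
| 2 | Hyper-modal Imputation Diffusion Embedding with Dual-Distillation for Federated Multimodal Knowledge Graph Completion | Proposes MMFeD3-HidE for federated multimodal knowledge graph completion | distillation, multimodal | |
| 3 | Frequency-Aligned Knowledge Distillation for Lightweight Spatiotemporal Forecasting | Proposes frequency-aligned knowledge distillation for lightweight spatiotemporal forecasting | MAE, distillation, spatiotemporal | |
| 4 | TROFI: Trajectory-Ranked Offline Inverse Reinforcement Learning | Proposes TROFI to address the missing reward function in offline reinforcement learning | reinforcement learning, offline reinforcement learning, inverse reinforcement learning | |
| 5 | EFRame: Deeper Reasoning via Exploration-Filter-Replay Reinforcement Learning Framework | Proposes the EFRame framework to address the shortcomings of GRPO on complex reasoning tasks | reinforcement learning, PPO, large language model | |
| 6 | Layer Importance for Mathematical Reasoning is Forged in Pre-Training and Invariant after Post-Training | Proposes a layer-importance analysis to improve mathematical reasoning | reinforcement learning, distillation, large language model | |
| 7 | TOAST: Task-Oriented Adaptive Semantic Transmission over Dynamic Wireless Environments | Proposes the TOAST framework for multi-task optimization in dynamic wireless environments | reinforcement learning, deep reinforcement learning, PULSE | |
| 8 | Reinforcement Learning with Physics-Informed Symbolic Program Priors for Zero-Shot Wireless Indoor Navigation | Proposes a reinforcement learning framework with physics-informed symbolic program priors for zero-shot indoor navigation | reinforcement learning | |
| 9 | SceneDiffuser++: City-Scale Traffic Simulation via a Generative World Model | Proposes SceneDiffuser++ for city-scale traffic simulation | world model | |
| 10 | MetaCipher: A Time-Persistent and Universal Multi-Agent Framework for Cipher-Based Jailbreak Attacks for LLMs | Proposes MetaCipher for low-cost multi-agent jailbreak attacks on LLMs | reinforcement learning, large language model | |
| 11 | Smooth-Distill: A Self-distillation Framework for Multitask Learning with Wearable Sensor Data | Proposes the Smooth-Distill framework for multitask learning on wearable sensor data | distillation | |
| 12 | Advancements and Challenges in Continual Reinforcement Learning: A Comprehensive Review | Reviews advances and challenges in continual reinforcement learning to advance dynamic learning capability | reinforcement learning | |
| 13 | A Survey of Continual Reinforcement Learning | Proposes continual reinforcement learning methods to address knowledge retention in dynamic environments | reinforcement learning | |
| 14 | Unfolding Generative Flows with Koopman Operators: Fast and Interpretable Sampling | Proposes a Koopman-operator-based method for unfolding generative flows to accelerate sampling | flow matching, distillation | |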

🔬 Pillar 9: Embodied Foundation Models (6 papers)

| # | Title | Key point | Tags | 🔗 |
|---|---|---|---|---|
| 15 | XxaCT-NN: Structure Agnostic Multimodal Learning for Materials Science | Proposes XxaCT-NN to address structure dependence in materials science | foundation model, multimodal | |
| 16 | UniCA: Adapting Time Series Foundation Model to General Covariate-Aware Forecasting | Proposes UniCA to address covariate adaptation in time series forecasting | foundation model, multimodal | |
| 17 | Sheaf-Based Decentralized Multimodal Learning for Next-Generation Wireless Communication Systems | Proposes Sheaf-DMFL for collaborative learning over multimodal data | multimodal | |
| 18 | OptScale: Probabilistic Optimality for Inference-time Scaling | Proposes OptScale to address the efficiency of inference-time scaling | large language model | |
| 19 | Projected Compression: Trainable Projection for Efficient Transformer Compression | Proposes Projected Compression for Transformer model compression | large language model | |
| 20 | GPAS: Accelerating Convergence of LLM Pretraining via Gradient-Preserving Activation Scaling | Proposes GPAS to address activation variance in large language model pretraining | large language model | |

🔬 Pillar 1: Robot Control (2 papers)

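| # | Title | Key point | Tags | 🔗 |
|---|---|---|---|---|
| 21 | ARMOR: Robust Reinforcement Learning-based Control for UAVs under Physical Attacks | Proposes ARMOR for UAV control under physical attacks | manipulation, reinforcement learning, privileged information | |
| 22 | Earthquake Damage Grades Prediction using An Ensemble Approach Integrating Advanced Machine and Deep Learning Models | Proposes an ensemble of advanced machine learning and deep learning models for predicting earthquake damage grades | manipulation | |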

🔬 Pillar 8: Physics-based Animation (1 paper)

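| # | Title | Key point | Tags | 🔗 |
|---|---|---|---|---|
| 23 | Hitchhiking Rides Dataset: Two decades of crowd-sourced records on stochastic traveling | Proposes a hitchhiking dataset for studying stochastic travel | spatiotemporal | |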
