cs.LG (2025-03-20)

📊 22 papers in total | 🔗 2 with code

🎯 Interest Area Navigation

Pillar 2: RL Algorithms & Architecture (13 🔗2) · Pillar 9: Embodied Foundation Models (7) · Pillar 8: Physics-based Animation (1) · Pillar 1: Robot Control (1)

🔬 Pillar 2: RL Algorithms & Architecture (13 papers)

| # | Title | One-line Summary | Tags | 🔗 |
|---|-------|------------------|------|----|
| 1 | OThink-MR1: Stimulating multimodal generalized reasoning capabilities via dynamic reinforcement learning | OThink-MR1: stimulates multimodal generalized reasoning capabilities via dynamic reinforcement learning. | reinforcement learning, large language model, multimodal | |
| 2 | Active management of battery degradation in wireless sensor network using deep reinforcement learning for group battery replacement | Proposes a deep-RL-based active battery management method for wireless sensor networks, enabling group battery replacement. | reinforcement learning, deep reinforcement learning, DRL | |
| 3 | Advances in Protein Representation Learning: Methods, Applications, and Future Directions | Surveys advances in protein representation learning, offering new perspectives for molecular biology, medical research, and drug discovery. | representation learning, multimodal | |
| 4 | Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't | Uses reinforcement learning to improve reasoning in small-scale LLMs, balancing effectiveness and cost. | reinforcement learning, large language model | |
| 5 | Network-wide Freeway Traffic Estimation Using Sparse Sensor Data: A Dirichlet Graph Auto-Encoder Approach | Proposes the DGAE model for network-wide traffic state estimation from sparse sensor data, improving cross-city transferability. | representation learning, sparse sensors | |
| 6 | Utilizing Reinforcement Learning for Bottom-Up part-wise Reconstruction of 2D Wire-Frame Projections | Proposes an RL-based bottom-up, part-wise method for reconstructing 2D wire-frame projections. | reinforcement learning, curriculum learning | |
| 7 | Nonparametric Bellman Mappings for Value Iteration in Distributed Reinforcement Learning | Proposes nonparametric Bellman mappings for value iteration in distributed reinforcement learning. | reinforcement learning, DRL | |
| 8 | InCo-DPO: Balancing Distribution Shift and Data Quality for Enhanced Preference Optimization | InCo-DPO: balances distribution shift and data quality to enhance preference optimization. | DPO, direct preference optimization | |
| 9 | Efficient ANN-Guided Distillation: Aligning Rate-based Features of Spiking Neural Networks through Hybrid Block-wise Replacement | Proposes an efficient ANN-guided SNN distillation framework based on hybrid block-wise replacement. | distillation | |
| 10 | Denoising-based Contractive Imitation Learning | Proposes denoising-based contractive imitation learning to address covariate shift. | imitation learning | |
| 11 | Bezier Distillation | Proposes Bezier distillation, combining multi-teacher knowledge distillation with Bezier curves to mitigate error accumulation in Rectified Flow. | distillation | |
| 12 | Disentangling Uncertainties by Learning Compressed Data Representation | Proposes CDRM, a compressed data representation model for disentangling uncertainties in learned system dynamics models. | reinforcement learning, multimodal | |
| 13 | Whenever, Wherever: Towards Orchestrating Crowd Simulations with Spatio-Temporal Spawn Dynamics | Proposes the nTPP-GMM model for modeling and orchestrating spatio-temporal spawn dynamics in crowd simulation. | reinforcement learning, deep reinforcement learning | |

🔬 Pillar 9: Embodied Foundation Models (7 papers)

| # | Title | One-line Summary | Tags | 🔗 |
|---|-------|------------------|------|----|
| 14 | Learning Universal Human Mobility Patterns with a Foundation Model for Cross-domain Data Fusion | Proposes an LLM-based framework for learning universal human mobility patterns via cross-domain data fusion. | large language model, foundation model | |
| 15 | Leveraging OpenFlamingo for Multimodal Embedding Analysis of C2C Car Parts Data | Uses OpenFlamingo for multimodal embedding analysis of C2C car-parts data. | multimodal | |
| 16 | Exploring the Hidden Reasoning Process of Large Language Models by Misleading Them | Proposes MisFT, which probes the hidden reasoning process of LLMs via misleading fine-tuning. | large language model | |
| 17 | Gene42: Long-Range Genomic Foundation Model With Dense Attention | Gene42: a long-range genomic foundation model with dense attention, handling sequences of up to 192,000 base pairs. | foundation model | |
| 18 | Accelerating Transformer Inference and Training with 2:4 Activation Sparsity | Accelerates Transformer inference and training using 2:4 activation sparsity. | large language model | |
| 19 | A preliminary data fusion study to assess the feasibility of Foundation Process-Property Models in Laser Powder Bed Fusion | Proposes a Gaussian-process-based data-fusion feasibility assessment to address data scarcity in laser powder bed fusion. | foundation model | |
| 20 | DNR Bench: Benchmarking Over-Reasoning in Reasoning LLMs | DNR Bench: a benchmark for evaluating over-reasoning in reasoning LLMs. | large language model | |

🔬 Pillar 8: Physics-based Animation (1 paper)

| # | Title | One-line Summary | Tags | 🔗 |
|---|-------|------------------|------|----|
| 21 | ScalingNoise: Scaling Inference-Time Search for Generating Infinite Videos | Proposes ScalingNoise, scaling inference-time search to address noise optimization in video generation. | spatiotemporal | |

🔬 Pillar 1: Robot Control (1 paper)

| # | Title | One-line Summary | Tags | 🔗 |
|---|-------|------------------|------|----|
| 22 | Transfer learning from first-principles calculations to experiments with chemistry-informed domain transformation | Proposes chemistry-informed domain-transformation transfer learning, bridging first-principles calculations and experiments to address experimental data scarcity. | sim2real | |
