| # | Title | Summary | Keywords | ✅ |
|---|-------|---------|----------|----|
| 1 | OThink-MR1: Stimulating multimodal generalized reasoning capabilities via dynamic reinforcement learning | Stimulates generalized multimodal reasoning capabilities via dynamic reinforcement learning. | reinforcement learning, large language model, multimodal | |
| 2 | Active management of battery degradation in wireless sensor network using deep reinforcement learning for group battery replacement | Proposes a deep-reinforcement-learning-based method for actively managing battery degradation in wireless sensor networks, enabling group battery replacement. | reinforcement learning, deep reinforcement learning, DRL | |
| 3 | Advances in Protein Representation Learning: Methods, Applications, and Future Directions | Surveys advances in protein representation learning, offering new perspectives for molecular biology, medical research, and drug discovery. | representation learning, multimodal | |
| 4 | Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't | Uses reinforcement learning to improve reasoning in small-scale LLMs, balancing effectiveness and cost. | reinforcement learning, large language model | ✅ |
| 5 | Network-wide Freeway Traffic Estimation Using Sparse Sensor Data: A Dirichlet Graph Auto-Encoder Approach | Proposes DGAE, which estimates network-wide freeway traffic states from sparse sensor data and improves cross-city transferability. | representation learning, sparse sensors | |
| 6 | Utilizing Reinforcement Learning for Bottom-Up part-wise Reconstruction of 2D Wire-Frame Projections | Proposes a reinforcement-learning-based, bottom-up, part-wise method for reconstructing 2D wire-frame projections. | reinforcement learning, curriculum learning | |
| 7 | Nonparametric Bellman Mappings for Value Iteration in Distributed Reinforcement Learning | Proposes nonparametric Bellman mappings for value iteration in distributed reinforcement learning. | reinforcement learning, DRL | |
| 8 | InCo-DPO: Balancing Distribution Shift and Data Quality for Enhanced Preference Optimization | InCo-DPO balances distribution shift and data quality to enhance preference optimization. | DPO, direct preference optimization | |
| 9 | Efficient ANN-Guided Distillation: Aligning Rate-based Features of Spiking Neural Networks through Hybrid Block-wise Replacement | Proposes an efficient ANN-guided SNN distillation training framework based on hybrid block-wise replacement. | distillation | |
| 10 | Denoising-based Contractive Imitation Learning | Proposes denoising-based contractive imitation learning to address covariate shift. | imitation learning | |
| 11 | Bezier Distillation | Proposes Bezier distillation, combining multi-teacher knowledge distillation with Bezier curves to address error accumulation in Rectified Flow. | distillation | |
| 12 | Disentangling Uncertainties by Learning Compressed Data Representation | Proposes CDRM, a compressed data representation model for disentangling uncertainties in learned system dynamics models. | reinforcement learning, multimodal | ✅ |
| 13 | Whenever, Wherever: Towards Orchestrating Crowd Simulations with Spatio-Temporal Spawn Dynamics | Proposes nTPP-GMM for modeling and orchestrating spatio-temporal spawn dynamics in crowd simulations. | reinforcement learning, deep reinforcement learning | |