cs.LG (2025-02-26)

📊 29 papers in total | 🔗 4 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (12 🔗2) · Pillar 2: RL & Architecture (12 🔗2) · Pillar 8: Physics-based Animation (3) · Pillar 1: Robot Control (2)

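🔬 Pillar 9: Embodied Foundation Models (12 papers)

#	Title	One-sentence summary	Tags	🔗	⭐
1	CodeIF: Benchmarking the Instruction-Following Capabilities of Large Language Models for Code Generation	CodeIF: the first benchmark for evaluating the instruction-following capabilities of large language models on code generation tasks.	large language model instruction following	✅
2	DreamNet: A Multimodal Framework for Semantic and Emotional Analysis of Sleep Narratives	DreamNet: a multimodal framework for semantic and emotional analysis of sleep narratives	multimodal
3	M2-omni: Advancing Omni-MLLM for Comprehensive Modality Support with Competitive Performance	M2-omni: a competitive omni-modal multimodal large language model, comparable to GPT-4o	large language model multimodal
4	TRIX: A More Expressive Model for Zero-shot Domain Transfer in Knowledge Graphs	TRIX: a more expressive model for zero-shot domain transfer in knowledge graphs	foundation model	✅
5	General Intelligence Requires Reward-based Pretraining	Proposes reward-based pretraining that decouples knowledge from reasoning to improve the general intelligence of LLMs	large language model
6	dCMF: Learning interpretable evolving patterns from temporal multiway data	Proposes the dCMF model, which combines dynamical systems with tensor factorization to learn interpretable patterns from temporal multiway data.	TAMP
7	Rethinking LLM Unlearning Objectives: A Gradient Perspective and Go Beyond	Proposes a gradient-effect-based framework for analyzing LLM unlearning and improves the unlearning objectives	large language model
8	Efficient Federated Search for Retrieval-Augmented Generation	Proposes RAGRoute for efficient federated search in retrieval-augmented generation (RAG).	large language model
9	Enhancing Gradient-based Discrete Sampling via Parallel Tempering	Proposes a parallel-tempering-based discrete Langevin sampling method that improves sampling efficiency on complex distributions	multimodal
10	Starjob: Dataset for LLM-Driven Job Shop Scheduling	Introduces the Starjob dataset for solving job shop scheduling problems with LLMs	large language model
11	The Sharpness Disparity Principle in Transformers for Accelerating Language Model Pre-Training	Reveals a sharpness disparity across Transformer blocks and proposes Blockwise LR to accelerate large language model pre-training.	large language model
12	(Mis)Fitting: A Survey of Scaling Laws	Reveals discrepancies in how scaling laws are fitted: a survey of scaling laws	foundation model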

🔬 Pillar 2: RL & Architecture (12 papers)

#	Title	One-sentence summary	Tags	🔗	⭐
13	Reward Shaping to Mitigate Reward Hacking in RLHF	Proposes Preference As Reward (PAR) to mitigate reward hacking in RLHF and improve alignment.	reinforcement learning RLHF reward shaping	✅
14	Revealing Treatment Non-Adherence Bias in Clinical Machine Learning Using Large Language Models	Uses large language models to reveal treatment non-adherence bias in clinical machine learning	predictive model large language model
15	Fewer May Be Better: Enhancing Offline Reinforcement Learning with Reduced Dataset	ReDOR: improves the performance and efficiency of offline reinforcement learning by reducing the dataset	reinforcement learning offline RL offline reinforcement learning
16	Research on Edge Computing and Cloud Collaborative Resource Scheduling Optimization Based on Deep Reinforcement Learning	Proposes a deep-reinforcement-learning-based method for optimizing edge-cloud collaborative resource scheduling	reinforcement learning deep reinforcement learning DRL
17	Recurrent Auto-Encoders for Enhanced Deep Reinforcement Learning in Wilderness Search and Rescue Planning	Proposes a deep reinforcement learning method enhanced with recurrent autoencoders to improve the efficiency of wilderness search and rescue planning.	reinforcement learning deep reinforcement learning
18	MCLRL: A Multi-Domain Contrastive Learning with Reinforcement Learning Framework for Few-Shot Modulation Recognition	Proposes the MCLRL framework, which combines multi-domain contrastive learning with reinforcement learning to address few-shot learning in modulation recognition.	reinforcement learning contrastive learning
19	Improving Representation Learning of Complex Critical Care Data with ICU-BERT	ICU-BERT: uses a multi-task Transformer to learn robust representations of complex ICU data and improve clinical decision support.	representation learning large language model
20	Can RLHF be More Efficient with Imperfect Reward Models? A Policy Coverage Perspective	Proposes the TPO algorithm, which uses imperfect reward models to improve the sample efficiency of online RLHF	reinforcement learning RLHF DPO
21	Distilling Reinforcement Learning Algorithms for In-Context Model-Based Planning	Proposes DICP, which performs model prediction through in-context learning to improve reinforcement learning efficiency.	reinforcement learning model-based RL distillation
22	Mapping representations in Reinforcement Learning via Semantic Alignment for Zero-Shot Stitching	Proposes a semantic-alignment-based zero-shot transfer method for reinforcement learning that enables policy reuse across visual and task domains	reinforcement learning deep reinforcement learning
23	Generalizable deep learning for photoplethysmography-based blood pressure estimation -- A Benchmarking Study	A study of how well deep learning models for PPG-based blood pressure estimation generalize: benchmarking and domain adaptation	MAE PULSE
24	Global Graph Propagation with Hierarchical Information Transfer for Incomplete Contrastive Multi-view Clustering	Proposes an incomplete contrastive multi-view clustering method based on global graph propagation with hierarchical information transfer	representation learning contrastive learning	✅

🔬 Pillar 8: Physics-based Animation (3 papers)

#	Title	One-sentence summary	Tags	🔗	⭐
25	CryptoPulse: Short-Term Cryptocurrency Forecasting with Dual-Prediction and Cross-Correlated Market Indicators	Proposes CryptoPulse, which uses dual prediction and cross-correlated market indicators for short-term cryptocurrency forecasting.	PULSE
26	A HEART for the environment: Transformer-Based Spatiotemporal Modeling for Air Quality Prediction	Proposes HEART, a Transformer-based spatiotemporal model that improves air quality prediction accuracy.	spatiotemporal
27	BeamVQ: Beam Search with Vector Quantization to Mitigate Data Scarcity in Physical Spatiotemporal Forecasting	Proposes BeamVQ, which uses vector quantization and beam search to mitigate data scarcity in physical spatiotemporal forecasting	spatiotemporal

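🔬 Pillar 1: Robot Control (2 papers)

#	Title	One-sentence summary	Tags	🔗	⭐
28	Efficient Reinforcement Learning by Guiding Generalist World Models with Non-Curated Data	Uses non-curated data to guide generalist world models and improve reinforcement learning efficiency	locomotion manipulation reinforcement learning
29	Invariance Pair-Guided Learning: Enhancing Robustness in Neural Networks	Proposes invariance pair-guided learning to improve the robustness of neural networks under out-of-distribution generalization	manipulation representation learning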
