cs.LG(2025-02-26)

📊 29 papers in total | 🔗 4 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (12 🔗2) · Pillar 2: RL & Architecture (12 🔗2) · Pillar 8: Physics-based Animation (3) · Pillar 1: Robot Control (2)

🔬 Pillar 9: Embodied Foundation Models (12 papers)

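| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 1 | CodeIF: Benchmarking the Instruction-Following Capabilities of Large Language Models for Code Generation | CodeIF: the first benchmark for evaluating the instruction-following capabilities of large language models on code generation tasks. | large language model, instruction following | |
| 2 | DreamNet: A Multimodal Framework for Semantic and Emotional Analysis of Sleep Narratives | DreamNet: a multimodal framework for semantic and emotional analysis of sleep narratives. | multimodal | |
| 3 | M2-omni: Advancing Omni-MLLM for Comprehensive Modality Support with Competitive Performance | M2-omni: a competitive omni-modal multimodal large language model, comparable to GPT-4o. | large language model, multimodal | |
| 4 | TRIX: A More Expressive Model for Zero-shot Domain Transfer in Knowledge Graphs | TRIX: a more expressive model for zero-shot domain transfer in knowledge graphs. | foundation model | |
| 5 | General Intelligence Requires Reward-based Pretraining | Proposes a reward-based pretraining approach that decouples knowledge from reasoning to improve the general intelligence of LLMs. | large language model | |
| 6 | dCMF: Learning interpretable evolving patterns from temporal multiway data | Proposes the dCMF model, which combines dynamical systems with tensor factorization to learn interpretable patterns from temporal multiway data. | TAMP | |
| 7 | Rethinking LLM Unlearning Objectives: A Gradient Perspective and Go Beyond | Proposes a gradient-effect-based framework for analyzing LLM unlearning and improves the unlearning objectives. | large language model | |
| 8 | Efficient Federated Search for Retrieval-Augmented Generation | Proposes RAGRoute for efficient federated search in retrieval-augmented generation (RAG). | large language model | |
| 9 | Enhancing Gradient-based Discrete Sampling via Parallel Tempering | Proposes a discrete Langevin sampling method based on parallel tempering that improves sampling efficiency on complex distributions. | multimodal | |
| 10 | Starjob: Dataset for LLM-Driven Job Shop Scheduling | Proposes the Starjob dataset for solving job shop scheduling problems with LLMs. | large language model | |
| 11 | The Sharpness Disparity Principle in Transformers for Accelerating Language Model Pre-Training | Reveals the Sharpness Disparity across Transformer blocks and proposes Blockwise LR to accelerate large language model pre-training. | large language model | |
| 12 | (Mis)Fitting: A Survey of Scaling Laws | A survey of scaling laws that exposes discrepancies in how scaling laws are fitted. | foundation model | |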

🔬 Pillar 2: RL & Architecture (12 papers)

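| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 13 | Reward Shaping to Mitigate Reward Hacking in RLHF | Proposes Preference As Reward (PAR) to mitigate reward hacking in RLHF and improve alignment. | reinforcement learning, RLHF, reward shaping | |
| 14 | Revealing Treatment Non-Adherence Bias in Clinical Machine Learning Using Large Language Models | Uses large language models to reveal treatment non-adherence bias in clinical machine learning. | predictive model, large language model | |
| 15 | Fewer May Be Better: Enhancing Offline Reinforcement Learning with Reduced Dataset | ReDOR: improves the performance and efficiency of offline reinforcement learning by reducing the dataset. | reinforcement learning, offline RL, offline reinforcement learning | |
| 16 | Research on Edge Computing and Cloud Collaborative Resource Scheduling Optimization Based on Deep Reinforcement Learning | Proposes a deep-reinforcement-learning-based method for optimizing edge-cloud collaborative resource scheduling. | reinforcement learning, deep reinforcement learning, DRL | |
| 17 | Recurrent Auto-Encoders for Enhanced Deep Reinforcement Learning in Wilderness Search and Rescue Planning | Proposes a deep reinforcement learning method enhanced with recurrent auto-encoders to improve the efficiency of wilderness search and rescue planning. | reinforcement learning, deep reinforcement learning | |
| 18 | MCLRL: A Multi-Domain Contrastive Learning with Reinforcement Learning Framework for Few-Shot Modulation Recognition | Proposes the MCLRL framework, which combines multi-domain contrastive learning with reinforcement learning to address few-shot learning in modulation recognition. | reinforcement learning, contrastive learning | |
| 19 | Improving Representation Learning of Complex Critical Care Data with ICU-BERT | ICU-BERT: uses a multi-task Transformer to learn robust representations of complex ICU data, improving clinical decision support. | representation learning, large language model | |
| 20 | Can RLHF be More Efficient with Imperfect Reward Models? A Policy Coverage Perspective | Proposes the TPO algorithm, which uses imperfect reward models to improve the sample efficiency of online RLHF. | reinforcement learning, RLHF, DPO | |
| 21 | Distilling Reinforcement Learning Algorithms for In-Context Model-Based Planning | Proposes DICP, which performs model prediction via in-context learning to improve reinforcement learning efficiency. | reinforcement learning, model-based RL, distillation | |
| 22 | Mapping representations in Reinforcement Learning via Semantic Alignment for Zero-Shot Stitching | Proposes a zero-shot transfer method for reinforcement learning based on semantic alignment, enabling policy reuse across visual and task domains. | reinforcement learning, deep reinforcement learning | |
| 23 | Generalizable deep learning for photoplethysmography-based blood pressure estimation -- A Benchmarking Study | A study of how well deep learning models for PPG-based blood pressure estimation generalize, covering benchmarking and domain adaptation. | MAE, PULSE | |
| 24 | Global Graph Propagation with Hierarchical Information Transfer for Incomplete Contrastive Multi-view Clustering | Proposes a global graph propagation method with hierarchical information transfer for incomplete contrastive multi-view clustering. | representation learning, contrastive learning | |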

🔬 Pillar 8: Physics-based Animation (3 papers)

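| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 25 | CryptoPulse: Short-Term Cryptocurrency Forecasting with Dual-Prediction and Cross-Correlated Market Indicators | Proposes CryptoPulse, which uses dual prediction and cross-correlated market indicators for short-term cryptocurrency forecasting. | PULSE | |
| 26 | A HEART for the environment: Transformer-Based Spatiotemporal Modeling for Air Quality Prediction | Proposes HEART, a Transformer-based spatiotemporal model that improves air quality prediction accuracy. | spatiotemporal | |
| 27 | BeamVQ: Beam Search with Vector Quantization to Mitigate Data Scarcity in Physical Spatiotemporal Forecasting | Proposes BeamVQ, which uses vector quantization and beam search to mitigate data scarcity in physical spatiotemporal forecasting. | spatiotemporal | |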

🔬 Pillar 1: Robot Control (2 papers)

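| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 28 | Efficient Reinforcement Learning by Guiding Generalist World Models with Non-Curated Data | Guides generalist world models with non-curated data to improve reinforcement learning efficiency. | locomotion, manipulation, reinforcement learning | |
| 29 | Invariance Pair-Guided Learning: Enhancing Robustness in Neural Networks | Proposes invariance pair-guided learning to improve the robustness of neural networks under out-of-distribution generalization. | manipulation, representation learning | |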
