cs.LG(2025-03-07)

📊 25 papers in total | 🔗 6 with code

🎯 Interest Area Navigation

Pillar 2: RL & Architecture (10 🔗3) · Pillar 9: Embodied Foundation Models (10 🔗1) · Pillar 1: Robot Control (2 🔗1) · Pillar 8: Physics-based Animation (2 🔗1) · Pillar 4: Generative Motion (1)

🔬 Pillar 2: RL & Architecture (10 papers)

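# | Title | One-sentence summary | Tags | 🔗
1 | R1-Omni: Explainable Omni-Multimodal Emotion Recognition with Reinforcement Learning | Proposes R1-Omni, which uses reinforcement learning to improve the performance and explainability of omni-multimodal emotion recognition. | reinforcement learning, large language model, multimodal
2 | On a Connection Between Imitation Learning and RLHF | Proposes the DIL framework, which unifies and optimizes RLHF alignment from an imitation learning perspective. | reinforcement learning, imitation learning, RLHF
3 | Impoola: The Power of Average Pooling for Image-Based Deep Reinforcement Learning | Impoola-CNN: uses average pooling to improve image-based deep reinforcement learning performance. | reinforcement learning, deep reinforcement learning
4 | Performance Comparisons of Reinforcement Learning Algorithms for Sequential Experimental Design | Studies the performance of reinforcement learning algorithms on sequential experimental design and explores their generalization ability. | reinforcement learning
5 | Spatial Distillation based Distribution Alignment (SDDA) for Cross-Headset EEG Classification | Proposes SDDA, a spatial-distillation-based distribution alignment method, to address cross-headset EEG classification. | distillation
6 | Adversarial Policy Optimization for Offline Preference-based Reinforcement Learning | Proposes the APPO algorithm, which addresses the conservatism problem in offline preference-based reinforcement learning and enables efficient policy optimization. | reinforcement learning
7 | Mastering Continual Reinforcement Learning through Fine-Grained Sparse Network Allocation and Dormant Neuron Exploration | Proposes SSDE, which tackles catastrophic forgetting in continual reinforcement learning through fine-grained sparse network allocation and dormant neuron exploration. | reinforcement learning
8 | Multi-Task Reinforcement Learning Enables Parameter Scaling | Multi-task reinforcement learning gains performance through parameter scaling, questioning the need for complex architectures. | reinforcement learning
9 | Linear-MoE: Linear Sequence Modeling Meets Mixture-of-Experts | Proposes Linear-MoE, which combines linear sequence modeling with Mixture-of-Experts to train large-scale models efficiently. | state space model, linear attention
10 | Guaranteeing Out-Of-Distribution Detection in Deep RL via Transition Estimation | Proposes a transition-estimation-based OOD detection method for deep reinforcement learning to ensure reliable deployment. | reinforcement learning, deep reinforcement learning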

🔬 Pillar 9: Embodied Foundation Models (10 papers)

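# | Title | One-sentence summary | Tags | 🔗
11 | Revitalizing Saturated Benchmarks: A Weighted Metric Approach for Differentiating Large Language Model Performance | Proposes EMDM, a weighted metric that better differentiates large language model performance on saturated benchmarks. | large language model, chain-of-thought
12 | A Real-time Multimodal Transformer Neural Network-powered Wildfire Forecasting System | Proposes a real-time wildfire forecasting system built on a multimodal Transformer neural network. | multimodal
13 | A Survey on Sparse Autoencoders: Interpreting the Internal Mechanisms of Large Language Models | Survey paper: sparse autoencoders for understanding the internal mechanisms of large language models. | large language model
14 | MergeQuant: Accurate 4-bit Static Quantization of Large Language Models by Channel-wise Calibration | MergeQuant: accurate 4-bit static quantization of large language models via channel-wise calibration. | large language model
15 | Capacity-Aware Inference: Mitigating the Straggler Effect in Mixture of Experts | Proposes capacity-aware inference to mitigate the straggler effect in Mixture-of-Experts models. | large language model, multimodal
16 | Fairness-Aware Low-Rank Adaptation Under Demographic Privacy Constraints | Proposes a LoRA-based fairness-aware fine-tuning method to address model bias under demographic privacy constraints. | foundation model
17 | SplitQuantV2: Enhancing Low-Bit Quantization of LLMs Without GPUs | SplitQuantV2: improves low-bit quantization accuracy of LLMs without GPU acceleration. | large language model
18 | Shifting Perspectives: Steering Vectors for Robust Bias Mitigation in LLMs | Proposes a steering-vector-based bias mitigation method to improve the fairness and safety of LLMs. | large language model
19 | Routing for Large ML Models | Proposes a network routing optimization framework for large-scale ML model training to improve communication efficiency. | large language model
20 | Every FLOP Counts: Scaling a 300B Mixture-of-Experts LING LLM without Premium GPUs | The Ling team presents a low-cost MoE large language model, training a 300B-parameter model without premium GPUs. | large language model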

🔬 Pillar 1: Robot Control (2 papers)

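# | Title | One-sentence summary | Tags | 🔗
21 | Policy Constraint by Only Support Constraint for Offline Reinforcement Learning | Proposes Only Support Constraint (OSC), a policy constraint method for offline reinforcement learning that mitigates over-conservatism. | OSC, reinforcement learning, offline RL
22 | A Multi-Fidelity Control Variate Approach for Policy Gradient Estimation | Proposes a multi-fidelity control variate policy gradient method to improve the efficiency of reinforcement learning on compute-intensive tasks. | sim-to-real, reinforcement learning, world model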

🔬 Pillar 8: Physics-based Animation (2 papers)

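# | Title | One-sentence summary | Tags | 🔗
23 | Decision-aware training of spatiotemporal forecasting models to select a top K subset of sites for intervention | Proposes a decision-aware training method to optimize the selection of intervention sites. | spatiotemporal
24 | TS-LIF: A Temporal Segment Spiking Neuron Network for Time Series Forecasting | Proposes the TS-LIF model to improve the accuracy and robustness of SNNs in time series forecasting. | spatiotemporal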

🔬 Pillar 4: Generative Motion (1 paper)

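# | Title | One-sentence summary | Tags | 🔗
25 | Riemann$^2$: Learning Riemannian Submanifolds from Riemannian Data | Proposes Riemann$^2$, which learns Riemannian submanifold representations on Riemannian manifolds. | motion synthesis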
