cs.LG (2026-03-18)

📊 21 papers in total

🎯 Interest area navigation

Pillar 2: RL & Architecture (11) · Pillar 9: Embodied Foundation Models (7) · Pillar 8: Physics-based Animation (2) · Pillar 1: Robot Control (1)

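🔬 Pillar 2: RL & Architecture (11 papers)

#	Title	Key point	Tags
1	Large-Scale 3D Ground-Motion Synthesis with Physics-Inspired Latent Operator Flow Matching	Proposes GMFlow, which uses physics-inspired latent operator flow matching to synthesize large-scale 3D ground motion.	flow matching, motion synthesis, spatiotemporal
2	Federated Distributional Reinforcement Learning with Distributional Critic Regularization	Proposes TR-FedDistRL to address safety problems caused by value-function averaging in federated reinforcement learning.	reinforcement learning, multimodal
3	Flow Matching Policy with Entropy Regularization	Proposes a flow matching policy with entropy regularization to address exploration in reinforcement learning.	reinforcement learning, diffusion policy, flow matching
4	Efficient Exploration at Scale	Proposes an efficient online reinforcement learning algorithm that uses a small amount of human feedback data to significantly improve LLM performance.	reinforcement learning, offline RL, RLHF
5	Atomic Trajectory Modeling with State Space Models for Biomolecular Dynamics	Proposes ATMOS, which uses state space models to generate atomic-level trajectories of biomolecular dynamics, accelerating drug discovery.	SSM, state space model
6	Efficient Soft Actor-Critic with LLM-Based Action-Level Guidance for Continuous Control	Proposes GuidedSAC to address efficient exploration in continuous control.	reinforcement learning, SAC, large language model
7	DSS-GAN: Directional State Space GAN with Mamba backbone for Class-Conditional Image Synthesis	DSS-GAN: the first class-conditional image-generation GAN to use a Mamba backbone, improving image synthesis quality.	Mamba
8	Benchmarking Reinforcement Learning via Stochastic Converse Optimality: Generating Systems with Known Optimal Policies	Proposes a reinforcement learning benchmarking framework based on stochastic converse optimality that generates systems with known optimal policies.	reinforcement learning
9	Complementary Reinforcement Learning	Proposes complementary reinforcement learning to address agents' under-use of experience under sparse rewards.	reinforcement learning
10	Causal Representation Learning on High-Dimensional Data: Benchmarks, Reproducibility, and Evaluation Metrics	For causal representation learning, proposes benchmarks, a reproducibility analysis, and comprehensive evaluation metrics.	representation learning
11	Operator-Theoretic Foundations and Policy Gradient Methods for General MDPs with Unbounded Costs	Proposes operator-theoretic policy gradient methods to handle unbounded costs in general MDPs.	reinforcement learning, PPO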

🔬 Pillar 9: Embodied Foundation Models (7 papers)

#	Title	Key point	Tags
12	Only relative ranks matter in weight-clustered large language models	Proposes an LLM compression method based on relative weight ranks that significantly reduces model size without training.	large language model
13	Discovering Decoupled Functional Modules in Large Language Models	Proposes the ULCMOD framework for unsupervised discovery of decoupled functional modules in large language models.	large language model
14	FoMo X: Modular Explainability Signals for Outlier Detection Foundation Models	FoMo-X: modular explainability signals for outlier detection foundation models.	foundation model
15	RAMP: Reinforcement Adaptive Mixed Precision Quantization for Efficient On Device LLM Inference	Proposes RAMP to address quantization efficiency for large language models.	large language model, zero-shot transfer
16	Deploying Semantic ID-based Generative Retrieval for Large-Scale Podcast Discovery at Spotify	Spotify proposes GLIDE, which uses semantic ID-based generative retrieval for large-scale podcast discovery, significantly improving the user exploration experience.	large language model, instruction following
17	Embedding World Knowledge into Tabular Models: Towards Best Practices for Embedding Pipeline Design	Systematically evaluates LLM embedding pipeline design for tabular prediction and provides best practices.	large language model
18	ZipServ: Fast and Memory-Efficient LLM Inference with Hardware-Aware Lossless Compression	ZipServ: speeds up LLM inference and reduces memory usage through hardware-aware lossless compression.	large language model

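🔬 Pillar 8: Physics-based Animation (2 papers)

#	Title	Key point	Tags
19	RHYME-XT: A Neural Operator for Spatiotemporal Control Systems	RHYME-XT: a neural operator learning framework for spatiotemporal control systems.	spatiotemporal
20	Data-driven model order reduction for structures with piecewise linear nonlinearity using dynamic mode decomposition	Proposes a data-driven reduced-order modeling method based on dynamic mode decomposition for structural dynamics problems with piecewise linear nonlinearity.	PULSE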

🔬 Pillar 1: Robot Control (1 paper)

#	Title	Key point	Tags
21	Unified Policy Value Decomposition for Rapid Adaptation	Proposes a unified policy-value decomposition framework for rapid adaptation in complex control systems.	locomotion, reinforcement learning
