cs.LG(2026-03-18)

📊 21 papers in total

🎯 Interest Area Navigation

Pillar 2: RL Algorithms and Architecture (11) · Pillar 9: Embodied Foundation Models (7) · Pillar 8: Physics-based Animation (2) · Pillar 1: Robot Control (1)

🔬 Pillar 2: RL Algorithms and Architecture (11 papers)

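| # | Title | Key point | Tags |
|---|-------|-----------|------|
| 1 | Large-Scale 3D Ground-Motion Synthesis with Physics-Inspired Latent Operator Flow Matching | Proposes GMFlow, which uses physics-inspired latent operator flow matching for large-scale 3D ground-motion synthesis. | flow matching, motion synthesis, spatiotemporal |
| 2 | Federated Distributional Reinforcement Learning with Distributional Critic Regularization | Proposes TR-FedDistRL to address the safety problems caused by averaging value functions in federated reinforcement learning. | reinforcement learning, multimodal |
| 3 | Flow Matching Policy with Entropy Regularization | Proposes a flow matching policy with entropy regularization to address exploration in reinforcement learning. | reinforcement learning, diffusion policy, flow matching |
| 4 | Efficient Exploration at Scale | Proposes an efficient online reinforcement learning algorithm that uses a small amount of human feedback data to significantly improve LLM performance. | reinforcement learning, offline RL, RLHF |
| 5 | Atomic Trajectory Modeling with State Space Models for Biomolecular Dynamics | Proposes ATMOS, which uses state space models to generate atomic-level trajectories of biomolecular dynamics, accelerating drug discovery. | SSM, state space model |
| 6 | Efficient Soft Actor-Critic with LLM-Based Action-Level Guidance for Continuous Control | Proposes GuidedSAC to address efficient exploration in continuous control. | reinforcement learning, SAC, large language model |
| 7 | DSS-GAN: Directional State Space GAN with Mamba backbone for Class-Conditional Image Synthesis | DSS-GAN: the first conditional image generative adversarial network with a Mamba backbone, improving image synthesis quality. | Mamba |
| 8 | Benchmarking Reinforcement Learning via Stochastic Converse Optimality: Generating Systems with Known Optimal Policies | Proposes a reinforcement learning benchmarking framework based on stochastic converse optimality that generates systems with known optimal policies. | reinforcement learning |
| 9 | Complementary Reinforcement Learning | Proposes Complementary Reinforcement Learning to address agents' under-utilization of experience under sparse rewards. | reinforcement learning |
| 10 | Causal Representation Learning on High-Dimensional Data: Benchmarks, Reproducibility, and Evaluation Metrics | Proposes benchmarks, a reproducibility analysis, and comprehensive evaluation metrics for causal representation learning. | representation learning |
| 11 | Operator-Theoretic Foundations and Policy Gradient Methods for General MDPs with Unbounded Costs | Proposes operator-theoretic policy gradient methods to handle unbounded costs in general MDPs. | reinforcement learning, PPO |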

🔬 Pillar 9: Embodied Foundation Models (7 papers)

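| # | Title | Key point | Tags |
|---|-------|-----------|------|
| 12 | Only relative ranks matter in weight-clustered large language models | Proposes an LLM compression method based on the relative ranks of weights that significantly reduces model size without training. | large language model |
| 13 | Discovering Decoupled Functional Modules in Large Language Models | Proposes the ULCMOD framework for unsupervised discovery of decoupled functional modules in large language models. | large language model |
| 14 | FoMo X: Modular Explainability Signals for Outlier Detection Foundation Models | FoMo-X: provides modular explainability signals for outlier detection foundation models. | foundation model |
| 15 | RAMP: Reinforcement Adaptive Mixed Precision Quantization for Efficient On Device LLM Inference | Proposes RAMP to address the efficiency of large language model quantization. | large language model, zero-shot transfer |
| 16 | Deploying Semantic ID-based Generative Retrieval for Large-Scale Podcast Discovery at Spotify | Spotify proposes GLIDE, which uses semantic-ID-based generative retrieval for large-scale podcast discovery and significantly improves the user exploration experience. | large language model, instruction following |
| 17 | Embedding World Knowledge into Tabular Models: Towards Best Practices for Embedding Pipeline Design | Systematically evaluates LLM embedding pipeline design for tabular prediction and provides best practices. | large language model |
| 18 | ZipServ: Fast and Memory-Efficient LLM Inference with Hardware-Aware Lossless Compression | ZipServ: speeds up LLM inference and reduces memory footprint through hardware-aware lossless compression. | large language model |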

🔬 Pillar 8: Physics-based Animation (2 papers)

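| # | Title | Key point | Tags |
|---|-------|-----------|------|
| 19 | RHYME-XT: A Neural Operator for Spatiotemporal Control Systems | RHYME-XT: a neural operator learning framework for spatiotemporal control systems. | spatiotemporal |
| 20 | Data-driven model order reduction for structures with piecewise linear nonlinearity using dynamic mode decomposition | Proposes a data-driven reduced-order modeling method based on dynamic mode decomposition for structural dynamics problems with piecewise linear nonlinearity. | PULSE |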

🔬 Pillar 1: Robot Control (1 paper)

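| # | Title | Key point | Tags |
|---|-------|-----------|------|
| 21 | Unified Policy Value Decomposition for Rapid Adaptation | Proposes a unified policy value decomposition framework for rapid adaptation in complex control systems. | locomotion, reinforcement learning |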
