cs.LG (2026-04-22)

📊 21 papers in total | 🔗 2 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (10 🔗1) · Pillar 2: RL & Architecture (8) · Pillar 1: Robot Control (2 🔗1) · Pillar 4: Generative Motion (1)

🔬 Pillar 9: Embodied Foundation Models (10 papers)

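#	Title	One-sentence summary	Tags	🔗	⭐
1	Adaptive Conformal Anomaly Detection with Time Series Foundation Models for Signal Monitoring	Proposes an adaptive conformal anomaly detection method that uses pretrained time series models for signal monitoring.	foundation model
2	CHASM: Unveiling Covert Advertisements on Chinese Social Media	Introduces the CHASM dataset for evaluating multimodal large language models on covert advertisement detection on Chinese social media.	large language model, multimodal
3	R2IF: Aligning Reasoning with Decisions via Composite Rewards for Interpretable LLM Function Calling	R2IF: aligns reasoning with decisions via composite rewards for interpretable LLM function calling.	large language model, chain-of-thought
4	Stream-CQSA: Avoiding Out-of-Memory in Attention Computation via Flexible Workload Scheduling	Stream-CQSA: avoids out-of-memory errors in attention computation through flexible workload scheduling.	large language model
5	Supplement Generation Training for Enhancing Agentic Task Performance	Proposes Supplement Generation Training (SGT) to improve agent task performance and reduce large-model training cost.	foundation model
6	COMPASS: COntinual Multilingual PEFT with Adaptive Semantic Sampling	COMPASS: continual multilingual PEFT with adaptive semantic sampling, improving cross-lingual LLM performance.	large language model
7	Evaluating Assurance Cases as Text-Attributed Graphs for Structure and Provenance Analysis	Proposes a graph-diagnostic framework for analyzing the structure and provenance of assurance cases.	large language model
8	Towards Event-Aware Forecasting in DeFi: Insights from On-chain Automated Market Maker Protocols	Proposes the UWM loss function to improve the accuracy and timing of event-driven AMM forecasting in DeFi.	TAMP	✅
9	Differentiable Conformal Training for LLM Reasoning Factuality	Proposes differentiable conformal training (DCF) to improve the factuality of LLM reasoning while preserving reliability guarantees.	large language model
10	On the Quantization Robustness of Diffusion Language Models in Coding Benchmarks	Studies the quantization robustness of diffusion language models on code generation tasks and finds them more robust than autoregressive models.	large language model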

🔬 Pillar 2: RL & Architecture (8 papers)

#	Title	One-sentence summary	Tags	🔗	⭐
11	MGDA-Decoupled: Geometry-Aware Multi-Objective Optimisation for DPO-based LLM Alignment	Proposes the MGDA-Decoupled algorithm for multi-objective optimization in DPO alignment.	reinforcement learning, DPO, large language model
12	Maximum Entropy Semi-Supervised Inverse Reinforcement Learning	Proposes the MESSI algorithm, which combines maximum entropy inverse reinforcement learning with semi-supervised learning to improve apprenticeship learning.	reinforcement learning, inverse reinforcement learning
13	Tokenised Flow Matching for Hierarchical Simulation Based Inference	Proposes Tokenised Flow Matching (TFMPE) to speed up hierarchical simulation-based inference and reduce compute cost.	flow matching
14	GRPO-VPS: Enhancing Group Relative Policy Optimization with Verifiable Process Supervision for Effective Reasoning	Proposes GRPO-VPS, which augments Group Relative Policy Optimization with verifiable process supervision to improve LLM reasoning.	reinforcement learning, large language model
15	Sheaf Neural Networks on SPD Manifolds: Second-Order Geometric Representation Learning	Proposes sheaf neural networks on SPD manifolds for second-order geometric representation learning.	representation learning
16	ParetoSlider: Diffusion Models Post-Training for Continuous Reward Control	ParetoSlider: continuous reward control via post-training of diffusion models.	reinforcement learning, flow matching
17	An explicit operator explains end-to-end computation in the modern neural networks used for sequence and language modeling	Explains the end-to-end computation of neural networks for sequence and language modeling through an explicit operator.	SSM, state space model
18	Temporally Extended Mixture-of-Experts Models	Proposes temporally extended mixture-of-experts models to address GPU memory limits.	reinforcement learning, distillation

🔬 Pillar 1: Robot Control (2 papers)

#	Title	One-sentence summary	Tags	🔗	⭐
19	Occupancy Reward Shaping: Improving Credit Assignment for Offline Goal-Conditioned Reinforcement Learning	Proposes Occupancy Reward Shaping to improve credit assignment in offline goal-conditioned reinforcement learning.	locomotion, manipulation, reinforcement learning	✅
20	Distributional Value Estimation Without Target Networks for Robust Quality-Diversity	QDHUAC: a distributional value estimation method without target networks that improves the robustness of quality-diversity algorithms.	locomotion, reinforcement learning

🔬 Pillar 4: Generative Motion (1 paper)

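#	Title	One-sentence summary	Tags	🔗	⭐
21	Physics-Conditioned Synthesis of Internal Ice-Layer Thickness for Incomplete Layer Traces	Proposes a physics-conditioned method for synthesizing ice-layer thickness, completing incomplete ice-layer information in radar images.	physically plausible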
