cs.LG (2026-03-20)

📊 17 papers in total | 🔗 1 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (8) · Pillar 2: RL & Architecture (7 🔗1) · Pillar 1: Robot Control (1) · Pillar 4: Generative Motion (1)

🔬 Pillar 9: Embodied Foundation Models (8 papers)

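#	Title	Key point	Tags	🔗	⭐
1	Wearable Foundation Models Should Go Beyond Static Encoders	Proposes a new paradigm for wearable foundation models that moves beyond static encoders to support long-term health reasoning.	foundation model, multimodal
2	Scalable Cross-Facility Federated Learning for Scientific Foundation Models on Multiple Supercomputers	Proposes a federated learning framework spanning multiple supercomputing centers for training scientific foundation models.	large language model, foundation model
3	Revisiting Gene Ontology Knowledge Discovery with Hierarchical Feature Selection and Virtual Study Group of AI Agents	Proposes a virtual study group based on agentic AI for Gene Ontology knowledge discovery, combined with hierarchical feature selection.	large language model
4	Memori: A Persistent Memory Layer for Efficient, Context-Aware LLM Agents	Memori: a persistent memory layer for efficient, context-aware LLM agents.	large language model
5	Eye Gaze-Informed and Context-Aware Pedestrian Trajectory Prediction in Shared Spaces with Automated Shuttles: A Virtual Reality Study	Proposes the GazeX-LSTM model, which uses eye tracking and contextual information to improve pedestrian trajectory prediction accuracy in shared spaces.	multimodal
6	Dual Path Attribution: Efficient Attribution for SwiGLU-Transformers through Layer-Wise Target Propagation	Proposes Dual Path Attribution (DPA), an efficient attribution method for SwiGLU-Transformers based on layer-wise target propagation.	large language model
7	GoAgent: Group-of-Agents Communication Topology Generation for LLM-based Multi-Agent Systems	GoAgent: a group-based communication topology generation method for LLM-based multi-agent systems.	large language model
8	Demonstrations, CoT, and Prompting: A Theoretical Analysis of ICL	A theoretical analysis of ICL: the effects of demonstrations, CoT, and prompting.	chain-of-thought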

🔬 Pillar 2: RL & Architecture (7 papers)

#	Title	Key point	Tags	🔗	⭐
9	Structured Latent Dynamics in Wireless CSI via Homomorphic World Models	Proposes a framework for learning structured latent dynamics in wireless CSI based on homomorphic world models.	world model, latent dynamics, scene understanding
10	FedPDPO: Federated Personalized Direct Preference Optimization for Large Language Model Alignment	Proposes FedPDPO to address personalized preference alignment of large language models in federated learning.	reinforcement learning, RLHF, DPO
11	What If Consensus Lies? Selective-Complementary Reinforcement Learning at Test Time	Proposes SCRL, which uses selective-complementary reinforcement learning to address label noise under weak consensus in test-time reasoning.	reinforcement learning, large language model	✅
12	FIPO: Eliciting Deep Reasoning with Future-KL Influenced Policy Optimization	FIPO: elicits deep reasoning in large language models through Future-KL influenced policy optimization.	reinforcement learning, large language model, chain-of-thought
13	DeepStock: Reinforcement Learning with Policy Regularizations for Inventory Management	DeepStock: optimizes inventory management with policy-regularized reinforcement learning.	reinforcement learning, deep reinforcement learning, DRL
14	Breaking the Capability Ceiling of LLM Post-Training by Reintroducing Markov States	Breaks the capability ceiling of LLM post-training by reintroducing Markov states.	reinforcement learning, large language model
15	Learning to Bet for Horizon-Aware Anytime-Valid Testing	Proposes a horizon-aware testing method based on deep reinforcement learning to optimize betting strategies.	reinforcement learning, deep reinforcement learning

🔬 Pillar 1: Robot Control (1 paper)

#	Title	Key point	Tags	🔗	⭐
16	NASimJax: GPU-Accelerated Policy Learning Framework for Penetration Testing	NASimJax: a GPU-accelerated policy learning framework for penetration testing.	domain randomization, reinforcement learning, policy learning

🔬 Pillar 4: Generative Motion (1 paper)

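#	Title	Key point	Tags	🔗	⭐
17	How Out-of-Equilibrium Phase Transitions can Seed Pattern Formation in Trained Diffusion Models	Interprets the generation process of diffusion models as an out-of-equilibrium phase transition, revealing the mechanism of pattern formation and improving control over generation.	classifier-free guidance, PULSE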
