cs.LG (2026-03-20)

📊 17 papers in total | 🔗 1 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (8) · Pillar 2: RL & Architecture (7 🔗1) · Pillar 1: Robot Control (1) · Pillar 4: Generative Motion (1)

🔬 Pillar 9: Embodied Foundation Models (8 papers)

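| # | Title | Key point | Tags | 🔗 |
| 1 | Wearable Foundation Models Should Go Beyond Static Encoders | Proposes a new paradigm for wearable foundation models that moves beyond static encoders to support long-term health reasoning. | foundation model, multimodal | |
| 2 | Scalable Cross-Facility Federated Learning for Scientific Foundation Models on Multiple Supercomputers | Proposes a federated learning framework spanning multiple supercomputing facilities for training scientific foundation models. | large language model, foundation model | |
| 3 | Revisiting Gene Ontology Knowledge Discovery with Hierarchical Feature Selection and Virtual Study Group of AI Agents | Proposes an agentic-AI virtual study group for Gene Ontology knowledge discovery, combined with hierarchical feature selection. | large language model | |
| 4 | Memori: A Persistent Memory Layer for Efficient, Context-Aware LLM Agents | Memori: a persistent memory layer for efficient, context-aware LLM agents. | large language model | |
| 5 | Eye Gaze-Informed and Context-Aware Pedestrian Trajectory Prediction in Shared Spaces with Automated Shuttles: A Virtual Reality Study | Proposes GazeX-LSTM, which uses eye tracking and contextual information to improve pedestrian trajectory prediction accuracy in shared spaces. | multimodal | |
| 6 | Dual Path Attribution: Efficient Attribution for SwiGLU-Transformers through Layer-Wise Target Propagation | Proposes Dual Path Attribution (DPA), which attributes SwiGLU-Transformers efficiently through layer-wise target propagation. | large language model | |
| 7 | GoAgent: Group-of-Agents Communication Topology Generation for LLM-based Multi-Agent Systems | GoAgent: a group-based communication topology generation method for LLM-based multi-agent systems. | large language model | |
| 8 | Demonstrations, CoT, and Prompting: A Theoretical Analysis of ICL | A theoretical analysis of ICL: the effects of demonstrations, CoT, and prompting. | chain-of-thought | |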

🔬 Pillar 2: RL & Architecture (7 papers)

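| # | Title | Key point | Tags | 🔗 |
| 9 | Structured Latent Dynamics in Wireless CSI via Homomorphic World Models | Proposes a framework based on homomorphic world models for learning structured latent dynamics in wireless CSI. | world model, latent dynamics, scene understanding | |
| 10 | FedPDPO: Federated Personalized Direct Preference Optimization for Large Language Model Alignment | Proposes FedPDPO to address personalized preference alignment of large language models in federated learning. | reinforcement learning, RLHF, DPO | |
| 11 | What If Consensus Lies? Selective-Complementary Reinforcement Learning at Test Time | Proposes SCRL, which uses selective-complementary reinforcement learning to address label noise under weak consensus in test-time reasoning. | reinforcement learning, large language model | |
| 12 | FIPO: Eliciting Deep Reasoning with Future-KL Influenced Policy Optimization | FIPO: elicits deep reasoning in large language models through Future-KL influenced policy optimization. | reinforcement learning, large language model, chain-of-thought | |
| 13 | DeepStock: Reinforcement Learning with Policy Regularizations for Inventory Management | DeepStock: optimizes inventory management via reinforcement learning with policy regularization. | reinforcement learning, deep reinforcement learning, DRL | |
| 14 | Breaking the Capability Ceiling of LLM Post-Training by Reintroducing Markov States | Breaks the capability ceiling of LLM post-training by reintroducing Markov states. | reinforcement learning, large language model | |
| 15 | Learning to Bet for Horizon-Aware Anytime-Valid Testing | Proposes a horizon-aware testing method based on deep reinforcement learning to optimize betting strategies. | reinforcement learning, deep reinforcement learning | |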

🔬 Pillar 1: Robot Control (1 paper)

| # | Title | Key point | Tags | 🔗 |
| 16 | NASimJax: GPU-Accelerated Policy Learning Framework for Penetration Testing | NASimJax: a GPU-accelerated policy learning framework for penetration testing. | domain randomization, reinforcement learning, policy learning | |

🔬 Pillar 4: Generative Motion (1 paper)

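| # | Title | Key point | Tags | 🔗 |
| 17 | How Out-of-Equilibrium Phase Transitions can Seed Pattern Formation in Trained Diffusion Models | Interprets the generative process of diffusion models as an out-of-equilibrium phase transition, revealing the pattern formation mechanism and improving generation control. | classifier-free guidance, PULSE | |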
