cs.LG (2025-07-21)

📊 27 papers total | 🔗 3 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (12) Pillar 2: RL & Architecture (12 🔗2) Pillar 8: Physics-based Animation (2) Pillar 1: Robot Control (1 🔗1)

🔬 Pillar 9: Embodied Foundation Models (12 papers)

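#	Title	Key point	Tags	🔗	⭐
1	Just Ask for Music (JAM): Multimodal and Personalized Natural Language Music Recommendation	Proposes the JAM framework for multimodal, personalized natural-language music recommendation.	large language model multimodal
2	Applying the Chinese Wall Reverse Engineering Technique to Large Language Model Code Editing	Applies the "Chinese Wall" reverse-engineering technique to improve the code-editing ability of code LLMs.	large language model
3	Foundation Models and Transformers for Anomaly Detection: A Survey	Surveys the application and progress of Transformers and foundation models in visual anomaly detection.	foundation model
4	FedMultiEmo: Real-Time Emotion Recognition via Multimodal Federated Learning	FedMultiEmo: real-time emotion recognition via multimodal federated learning.	multimodal
5	Privacy-Preserving Multimodal News Recommendation through Federated Learning	Proposes a federated-learning-based multimodal news recommendation method that addresses privacy in personalized recommendation.	multimodal
6	Multimodal Fine-grained Reasoning for Post Quality Evaluation	Proposes the MFTRR framework for post quality evaluation via multimodal fine-grained reasoning.	multimodal
7	Diffusion Beats Autoregressive in Data-Constrained Settings	Diffusion models outperform autoregressive models in data-constrained settings.	large language model
8	HyDRA: A Hybrid-Driven Reasoning Architecture for Verifiable Knowledge Graphs	HyDRA: a hybrid-driven reasoning architecture for building verifiable knowledge graphs.	large language model
9	FASTGEN: Fast and Cost-Effective Synthetic Tabular Data Generation with LLMs	FASTGEN: fast, cost-effective synthetic tabular data generation with LLMs.	large language model
10	Towards Reliable, Uncertainty-Aware Alignment	Proposes a variance-aware policy optimization framework to improve the stability and robustness of LLM alignment.	large language model
11	PhysGym: Benchmarking LLMs in Interactive Physics Discovery with Controlled Priors	PhysGym: a benchmark platform for LLM-driven interactive physics discovery with controlled priors.	large language model
12	Towards Mitigation of Hallucination for LLM-empowered Agents: Progressive Generalization Bound Exploration and Watchdog Monitor	Proposes the HalMit framework, which mitigates hallucination in LLM agents through generalization-bound exploration and monitoring.	large language model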

🔬 Pillar 2: RL & Architecture (12 papers)

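#	Title	Key point	Tags	🔗	⭐
13	MSGM: A Multi-Scale Spatiotemporal Graph Mamba for EEG Emotion Recognition	Proposes MSGM, a multi-scale spatiotemporal graph Mamba model for EEG emotion recognition.	Mamba spatiotemporal
14	Small LLMs Do Not Learn a Generalizable Theory of Mind via Reinforcement Learning	Small LLMs struggle to acquire a generalizable theory of mind through reinforcement learning.	reinforcement learning large language model
15	Mixture of Autoencoder Experts Guidance using Unlabeled and Incomplete Data for Exploration in Reinforcement Learning	Proposes a reinforcement learning exploration method based on a mixture of autoencoder experts that uses unlabeled and incomplete data to guide learning.	reinforcement learning generalist agent
16	Long-Short Distance Graph Neural Networks and Improved Curriculum Learning for Emotion Recognition in Conversation	Proposes long-short distance graph neural networks and an improved curriculum learning method to improve emotion recognition in conversation.	curriculum learning multimodal
17	Automated Design of Structured Variational Quantum Circuits with Reinforcement Learning	Proposes a reinforcement-learning-based method for automated design of variational quantum circuits, targeting combinatorial optimization problems.	reinforcement learning PPO
18	Off-Policy Corrected Reward Modeling for Reinforcement Learning from Human Feedback	Proposes Off-Policy Corrected Reward Modeling (OCRM) to address overoptimization in RLHF.	reinforcement learning RLHF	✅
19	To Label or Not to Label: PALM -- A Predictive Model for Evaluating Sample Efficiency in Active Learning Models	Proposes PALM, a model for predicting the sample efficiency of active learning models under different labeling budgets.	predictive model	✅
20	Reinforcement Learning in hyperbolic space for multi-step reasoning	Proposes a reinforcement learning framework based on hyperbolic Transformers for multi-step reasoning.	reinforcement learning
21	Minor Embedding for Quantum Annealing with Reinforcement Learning	Proposes a reinforcement-learning-based minor embedding method for quantum annealing that improves generalization across problem sizes and hardware topologies.	reinforcement learning
22	LLM Economist: Large Population Models and Mechanism Design in Multi-Agent Generative Simulacra	LLM Economist: uses a multi-agent generative simulation environment to design and evaluate economic policy.	reinforcement learning large language model
23	Data Mixing Agent: Learning to Re-weight Domains for Continual Pre-training	Proposes the Data Mixing Agent, which uses reinforcement learning to learn domain re-weighting policies automatically and improve continual pre-training.	reinforcement learning large language model
24	Red-Team Multi-Agent Reinforcement Learning for Emergency Braking Scenario	Proposes a red-team multi-agent reinforcement learning framework for uncovering corner cases in emergency braking scenarios.	reinforcement learning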

🔬 Pillar 8: Physics-based Animation (2 papers)

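#	Title	Key point	Tags	🔗	⭐
25	Blending data and physics for reduced-order modeling of systems with spatiotemporal chaotic dynamics	Proposes a reduced-order modeling method that blends data and physics for spatiotemporally chaotic systems.	spatiotemporal
26	Structural DID with ML: Theory, Simulation, and a Roadmap for Applied Research	Proposes the S-DIDML framework, which combines structural DID with machine learning to address high-dimensional confounding in observational panel data.	spatiotemporal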

🔬 Pillar 1: Robot Control (1 paper)

#	Title	Key point	Tags	🔗	⭐
27	Joint-Local Grounded Action Transformation for Sim-to-Real Transfer in Multi-Agent Traffic Control	Proposes JL-GAT to address sim-to-real transfer in multi-agent traffic control.	sim-to-real reinforcement learning	✅
