cs.LG (2025-04-07)

📊 25 papers in total | 🔗 4 with code

🎯 Interest Area Navigation

Pillar 2: RL & Architecture (11 🔗1) Pillar 9: Embodied Foundation Models (10 🔗3) Pillar 8: Physics-based Animation (2) Pillar 7: Motion Retargeting (1) Pillar 3: Perception & Semantics (1)

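🔬 Pillar 2: RL & Architecture (11 papers)

#	Title	One-line summary	Tags	🔗	⭐
1	ACE-RLHF: Automated Code Evaluation and Socratic Feedback Generation Tool using Large Language Models and Reinforcement Learning with Human Feedback	Proposes ACE-RLHF, a tool that uses LLMs and RLHF to automate code evaluation and generate Socratic feedback.	reinforcement learning RLHF large language model
2	A Unified Pairwise Framework for RLHF: Bridging Generative Reward Modeling and Policy Optimization	Proposes Pairwise-RL, a unified pairwise framework for RLHF that improves reward-model calibration and policy optimization.	reinforcement learning PPO RLHF
3	Attention-Augmented Inverse Reinforcement Learning with Graph Convolutions for Multi-Agent Task Allocation	Proposes an inverse reinforcement learning method based on attention and graph convolutions for multi-agent task allocation.	reinforcement learning deep reinforcement learning DRL
4	Efficient Reinforcement Finetuning via Adaptive Curriculum Learning	Proposes AdaRFT, which uses adaptive curriculum learning to improve the efficiency and accuracy of reinforcement finetuning for mathematical reasoning.	PPO curriculum learning IMoS
5	A Reinforcement Learning Method for Environments with Stochastic Variables: Post-Decision Proximal Policy Optimization with Dual Critic Networks	Proposes PDPPO, an algorithm based on post-decision states and dual critic networks that improves reinforcement learning performance in stochastic environments.	reinforcement learning deep reinforcement learning PPO
6	Bidirectional Hierarchical Protein Multi-Modal Representation Learning	Proposes a bidirectional hierarchical multimodal representation learning framework for proteins that fuses sequence and structure information.	representation learning multimodal
7	Gaussian Mixture Flow Matching Models	Proposes Gaussian mixture flow matching (GMFlow), which improves few-step sampling quality and mitigates color oversaturation in image generation.	flow matching classifier-free guidance
8	Large-Scale Mixed-Traffic and Intersection Control using Multi-agent Reinforcement Learning	Proposes a multi-agent reinforcement learning method for large-scale mixed-traffic intersection control.	reinforcement learning penetration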
9	The Role of Environment Access in Agnostic Reinforcement Learning	Studies how the form of environment access affects sample efficiency in agnostic reinforcement learning.	reinforcement learning policy learning
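10	Playing Non-Embedded Card-Based Games with Reinforcement Learning	Proposes a non-embedded reinforcement learning approach for real-time Clash Royale matches from visual input.	reinforcement learning offline reinforcement learning	✅
11	RLBayes: a Bayesian Network Structure Learning Algorithm via Reinforcement Learning-Based Search Strategy	Proposes RLBayes, which uses a reinforcement-learning-based search strategy to tackle the NP-hard problem of Bayesian network structure learning.	reinforcement learning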

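🔬 Pillar 9: Embodied Foundation Models (10 papers)

#	Title	One-line summary	Tags	🔗	⭐
12	Optimizing Large Language Models: Metrics, Energy Efficiency, and Case Study Insights	Optimizes large language models through quantization and local inference to reduce energy consumption and carbon emissions.	large language model
13	BRIDGES: Bridging Graph Modality and Large Language Models within EDA Tasks	BRIDGES: bridges the graph modality and large language models in EDA tasks to improve performance.	large language model
14	System Log Parsing with Large Language Models: A Review	A review of LLM-based methods for system log parsing.	large language model
15	LagKV: Lag-Relative Information of the KV Cache Tells Which Tokens Are Important	Proposes LagKV, which uses lag-relative information in the KV cache to compress the cache for long-context LLM inference.	large language model	✅
16	GraphRAFT: Retrieval Augmented Fine-Tuning for Knowledge Graphs on Graph Databases	GraphRAFT: a retrieval-augmented fine-tuning method for knowledge graphs on graph databases.	large language model
17	Dion: Distributed Orthonormalized Updates	Dion: a scalable distributed orthonormalized update method that accelerates large-scale LLM training.	foundation model	✅
18	DDPM Score Matching and Distribution Learning	Links DDPM score matching to distribution learning, improving the statistical efficiency of generative models.	multimodal
19	Pr$εε$mpt: Sanitizing Sensitive Prompts for LLMs	Pr$εε$mpt: a system that sanitizes sensitive prompts for LLMs to protect user privacy.	large language model
20	Mixture-of-Personas Language Models for Population Simulation	Proposes a Mixture-of-Personas (MoP) language model for simulating population behavior without fine-tuning.	large language model
21	Achieving binary weight and activation for LLMs using Post-Training Quantization	Proposes the W(1+1)A(1*4) quantization framework, which binarizes LLMs and significantly reduces computational cost.	large language model	✅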

🔬 Pillar 8: Physics-based Animation (2 papers)

#	Title	One-line summary	Tags	🔗	⭐
22	Unifying Physics- and Data-Driven Modeling via Novel Causal Spatiotemporal Graph Neural Network for Interpretable Epidemic Forecasting	Proposes CSTGNN, which fuses physics- and data-driven models for interpretable epidemic forecasting.	spatiotemporal
23	Well2Flow: Reconstruction of reservoir states from sparse wells using score-based generative models	Well2Flow: reconstructs reservoir states from sparse well data using score-based generative models.	spatiotemporal

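🔬 Pillar 7: Motion Retargeting (1 paper)

#	Title	One-line summary	Tags	🔗	⭐
24	Rethinking RoPE: A Mathematical Blueprint for N-dimensional Positional Embedding	Proposes a mathematical framework for N-dimensional RoPE based on Lie groups and Lie algebras, giving a unified theory of multi-dimensional positional encoding.	structure preservation large language model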

🔬 Pillar 3: Perception & Semantics (1 paper)

#	Title	One-line summary	Tags	🔗	⭐
25	Feedback-Enhanced Hallucination-Resistant Vision-Language Model for Real-Time Scene Understanding	Proposes a feedback-enhanced, hallucination-resistant vision-language model for real-time scene understanding.	scene understanding
