cs.LG (2025-04-07)

📊 25 papers in total | 🔗 4 with code

🎯 Interest Area Navigation

Pillar 2: RL & Architecture (11 🔗1) · Pillar 9: Embodied Foundation Models (10 🔗3) · Pillar 8: Physics-based Animation (2) · Pillar 7: Motion Retargeting (1) · Pillar 3: Perception & Semantics (1)

🔬 Pillar 2: RL & Architecture (11 papers)

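| # | Title | One-line summary | Tags | 🔗 |
|---|-------|------------------|------|----|
| 1 | ACE-RLHF: Automated Code Evaluation and Socratic Feedback Generation Tool using Large Language Models and Reinforcement Learning with Human Feedback | Proposes ACE-RLHF, a tool that uses LLMs and RLHF to automate code evaluation and generate Socratic feedback. | reinforcement learning, RLHF, large language model | |
| 2 | A Unified Pairwise Framework for RLHF: Bridging Generative Reward Modeling and Policy Optimization | Proposes Pairwise-RL, which optimizes RLHF through a unified pairwise framework, improving reward-model calibration and policy optimization. | reinforcement learning, PPO, RLHF | |
| 3 | Attention-Augmented Inverse Reinforcement Learning with Graph Convolutions for Multi-Agent Task Allocation | Proposes an inverse reinforcement learning method based on attention and graph convolutions for multi-agent task allocation. | reinforcement learning, deep reinforcement learning, DRL | |
| 4 | Efficient Reinforcement Finetuning via Adaptive Curriculum Learning | Proposes AdaRFT, which uses adaptive curriculum learning to improve the efficiency and accuracy of reinforcement finetuning for mathematical reasoning. | PPO, curriculum learning, IMoS | |
| 5 | A Reinforcement Learning Method for Environments with Stochastic Variables: Post-Decision Proximal Policy Optimization with Dual Critic Networks | Proposes PDPPO, built on post-decision states and dual critic networks, to improve reinforcement learning performance in stochastic environments. | reinforcement learning, deep reinforcement learning, PPO | |
| 6 | Bidirectional Hierarchical Protein Multi-Modal Representation Learning | Proposes a bidirectional hierarchical multi-modal representation learning framework for proteins that fuses sequence and structure information. | representation learning, multimodal | |
| 7 | Gaussian Mixture Flow Matching Models | Proposes Gaussian mixture flow matching models (GMFlow), improving few-step sampling quality and mitigating color oversaturation in image generation. | flow matching, classifier-free guidance | |
| 8 | Large-Scale Mixed-Traffic and Intersection Control using Multi-agent Reinforcement Learning | Proposes a multi-agent reinforcement learning method for large-scale mixed-traffic intersection control. | reinforcement learning, penetration | |
| 9 | The Role of Environment Access in Agnostic Reinforcement Learning | Proposes a reinforcement learning method free of environment assumptions to address sample efficiency. | reinforcement learning, policy learning | |
| 10 | Playing Non-Embedded Card-Based Games with Reinforcement Learning | Proposes a non-embedded reinforcement learning strategy for real-time Clash Royale matches from visual input. | reinforcement learning, offline reinforcement learning | |
| 11 | RLBayes: a Bayesian Network Structure Learning Algorithm via Reinforcement Learning-Based Search Strategy | Proposes RLBayes, which uses a reinforcement-learning-based search strategy to tackle the NP-hard problem of Bayesian network structure learning. | reinforcement learning | |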

🔬 Pillar 9: Embodied Foundation Models (10 papers)

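| # | Title | One-line summary | Tags | 🔗 |
|---|-------|------------------|------|----|
| 12 | Optimizing Large Language Models: Metrics, Energy Efficiency, and Case Study Insights | Optimizes large language models through quantization and local inference, reducing energy consumption and carbon emissions. | large language model | |
| 13 | BRIDGES: Bridging Graph Modality and Large Language Models within EDA Tasks | BRIDGES: bridges the graph modality and large language models in EDA tasks, improving performance. | large language model | |
| 14 | System Log Parsing with Large Language Models: A Review | Review: surveys LLM-based methods for system log parsing. | large language model | |
| 15 | LagKV: Lag-Relative Information of the KV Cache Tells Which Tokens Are Important | Proposes LagKV, which uses lag-relative information in the KV cache to compress the cache for long-context LLM inference. | large language model | |
| 16 | GraphRAFT: Retrieval Augmented Fine-Tuning for Knowledge Graphs on Graph Databases | GraphRAFT: a retrieval-augmented fine-tuning method for knowledge graphs on graph databases. | large language model | |
| 17 | Dion: Distributed Orthonormalized Updates | Dion: a scalable distributed orthonormalized update method that accelerates large-scale LLM training. | foundation model | |
| 18 | DDPM Score Matching and Distribution Learning | Links DDPM score matching to distribution learning, improving the statistical efficiency of generative models. | multimodal | |
| 19 | Pr$εε$mpt: Sanitizing Sensitive Prompts for LLMs | Pr$εε$mpt: a system that sanitizes sensitive prompts for LLMs to protect user privacy. | large language model | |
| 20 | Mixture-of-Personas Language Models for Population Simulation | Proposes Mixture-of-Personas (MoP) language models for simulating population behavior without fine-tuning. | large language model | |
| 21 | Achieving binary weight and activation for LLMs using Post-Training Quantization | Proposes the W(1+1)A(1*4) quantization framework, which binarizes LLMs and significantly reduces computational cost. | large language model | |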

🔬 Pillar 8: Physics-based Animation (2 papers)

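| # | Title | One-line summary | Tags | 🔗 |
|---|-------|------------------|------|----|
| 22 | Unifying Physics- and Data-Driven Modeling via Novel Causal Spatiotemporal Graph Neural Network for Interpretable Epidemic Forecasting | Proposes CSTGNN, which fuses physics- and data-driven models for interpretable epidemic forecasting. | spatiotemporal | |
| 23 | Well2Flow: Reconstruction of reservoir states from sparse wells using score-based generative models | Well2Flow: reconstructs reservoir states from sparse well data using score-based generative models. | spatiotemporal | |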

🔬 Pillar 7: Motion Retargeting (1 paper)

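| # | Title | One-line summary | Tags | 🔗 |
|---|-------|------------------|------|----|
| 24 | Rethinking RoPE: A Mathematical Blueprint for N-dimensional Positional Embedding | Proposes a mathematical framework for N-dimensional RoPE based on Lie groups and Lie algebras, giving a unified theory of multi-dimensional positional encoding. | structure preservation, large language model | |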

🔬 Pillar 3: Perception & Semantics (1 paper)

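| # | Title | One-line summary | Tags | 🔗 |
|---|-------|------------------|------|----|
| 25 | Feedback-Enhanced Hallucination-Resistant Vision-Language Model for Real-Time Scene Understanding | Proposes a feedback-enhanced, hallucination-resistant vision-language model for real-time scene understanding. | scene understanding | |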
