cs.LG (2025-05-13)

📊 27 papers in total | 🔗 3 with code

🎯 Interest Area Navigation

Pillar 2: RL Algorithms and Architecture (14) · Pillar 9: Embodied Foundation Models (11 🔗3) · Pillar 1: Robot Control (1) · Pillar 5: Interaction and Reaction (1)

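🔬 Pillar 2: RL Algorithms and Architecture (14 papers)

#	Title	One-line summary	Tags	🔗	⭐
1	A Practical Introduction to Deep Reinforcement Learning	A tutorial on deep reinforcement learning that uses the PPO algorithm as its running example to give a practical introduction	reinforcement learning deep reinforcement learning DRL
2	Block-Biased Mamba for Long-Range Sequence Processing	Proposes Block-Biased Mamba (B2S6) to improve Mamba's performance on long-sequence tasks	Mamba SSM state space model
3	InfoPO: On Mutual Information Maximization for Large Language Model Alignment	Proposes InfoPO, which improves large language model alignment through mutual information maximization	direct preference optimization large language model
4	Cost Function Estimation Using Inverse Reinforcement Learning with Minimal Observations	Proposes an inverse reinforcement learning algorithm that estimates cost functions in continuous spaces from a small number of observations	reinforcement learning inverse reinforcement learning
5	DyGSSM: Multi-view Dynamic Graph Embeddings with State Space Model Gradient Update	DyGSSM: a multi-view dynamic graph embedding method that incorporates state space model gradient updates	SSM state space model representation learning
6	DSADF: Thinking Fast and Slow for Decision Making	Proposes DSADF, a dual-system decision-making framework that improves the generalization of reinforcement learning agents in dynamic environments	reinforcement learning large language model foundation model
7	Efficient Unstructured Pruning of Mamba State-Space Models for Resource-Constrained Environments	Proposes an unstructured pruning method for Mamba models, aimed at efficient deployment in resource-constrained environments	Mamba SSM
8	A Multi-scale Representation Learning Framework for Long-Term Time Series Forecasting	MDMixer: a multi-scale representation learning framework for long-term time series forecasting	representation learning MAE
9	Feasibility-Aware Pessimistic Estimation: Toward Long-Horizon Safety in Offline RL	Proposes the FASP framework to address long-horizon safety and generalization in offline safe reinforcement learning	reinforcement learning offline RL
10	Continual Reinforcement Learning via Autoencoder-Driven Task and New Environment Recognition	Proposes an autoencoder-driven method for recognizing tasks and new environments to address continual reinforcement learning	reinforcement learning
11	Constrained Edge AI Deployment: Fine-Tuning vs Distillation for LLM Compression	Studies the performance differences between fine-tuning and distillation for LLM compression in edge AI deployment	distillation
12	Credit Assignment and Efficient Exploration based on Influence Scope in Multi-agent Reinforcement Learning	Proposes an influence-scope-based multi-agent reinforcement learning method that addresses credit assignment and efficient exploration under sparse rewards	reinforcement learning
13	SPAT: Sensitivity-based Multihead-attention Pruning on Time Series Forecasting Models	SPAT: a sensitivity-based multi-head attention pruning method that improves the efficiency of time series forecasting models	Mamba MAE
14	Low-Complexity Inference in Continual Learning via Compressed Knowledge Transfer	Proposes a low-complexity inference framework to address computational cost in continual learning	teacher-student distillation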

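🔬 Pillar 9: Embodied Foundation Models (11 papers)

#	Title	One-line summary	Tags	🔗	⭐
15	Generalizing Large Language Model Usability Across Resource-Constrained	Proposes a general LLM usability framework that improves performance on multimodal and low-resource tasks in resource-constrained settings	large language model multimodal
16	Large Language Models for Computer-Aided Design: A Survey	The first survey of LLM applications in CAD, summarizing six application directions and outlining future work	large language model	✅
17	AI Accelerators for Large Language Model Inference: Architecture Analysis and Scaling Strategies	Analyzes AI accelerator architectures for large language model inference and proposes scaling strategies	large language model
18	Towards Foundation Models for Experimental Readout Systems Combining Discrete and Continuous Data	Builds a proto foundation model for nuclear physics experimental readout systems that combines discrete and continuous data	foundation model
19	ExEBench: Benchmarking Foundation Models on Extreme Earth Events	ExEBench: a benchmark for evaluating foundation models on extreme Earth events, in support of disaster management	foundation model	✅
20	Model-Distributed Inference for Large Language Models at the Edge	Proposes MDI-LLM, which enables model-distributed inference of large language models on edge devices	large language model
21	Automatic detection of abnormal clinical EEG: comparison of a finetuned foundation model with two deep learning models	Uses the fine-tuned pretrained model BioSerenity-E1 to automatically detect abnormal EEGs	foundation model
22	DPL: Decoupled Prototype Learning for Enhancing Robustness of Vision-Language Transformers to Missing Modalities	DPL: decoupled prototype learning that makes vision-language Transformers more robust to missing modalities	multimodal
23	CodePDE: An Inference Framework for LLM-driven PDE Solver Generation	CodePDE: an inference framework that uses large language models to generate partial differential equation solvers	large language model	✅
24	PWC-MoE: Privacy-Aware Wireless Collaborative Mixture of Experts	Proposes the PWC-MoE framework to balance privacy protection and performance for LLMs in bandwidth-constrained environments	large language model
25	Deep Probabilistic Modeling of User Behavior for Anomaly Detection via Mixture Density Networks	Proposes an anomaly detection method based on deep mixture density networks that improves recognition of anomalous patterns in complex user behavior	multimodal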

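🔬 Pillar 1: Robot Control (1 paper)

#	Title	One-line summary	Tags	🔗	⭐
26	LLM Enhancers for GNNs: An Analysis from the Perspective of Causal Mechanism Identification	Analyzes LLM-enhanced GNNs through causal mechanism identification and proposes an optimization module to improve information transfer	manipulation representation learning large language model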

🔬 Pillar 5: Interaction and Reaction (1 paper)

#	Title	One-line summary	Tags	🔗	⭐
27	Privacy-Preserving Analytics for Smart Meter (AMI) Data: A Hybrid Approach to Comply with CPUC Privacy Regulations	Proposes a hybrid privacy-preserving architecture to address privacy compliance in smart meter data analytics	OMOMO
