cs.LG (2025-03-28)

📊 25 papers in total | 🔗 3 with code

🎯 Interest areas

Pillar 2: RL & Architecture (13 🔗1) | Pillar 9: Embodied Foundation Models (10 🔗2) | Pillar 1: Robot Control (1) | Pillar 8: Physics-based Animation (1)

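🔬 Pillar 2: RL & Architecture (13 papers)

#	Title	One-sentence summary	Tags	🔗	⭐
1	A Survey of Circuit Foundation Model: Foundation AI Models for VLSI Circuit Design and EDA	A survey of circuit foundation models: foundation AI models for VLSI circuit design and EDA.	representation learning large language model foundation model
2	Exploring Data Scaling Trends and Effects in Reinforcement Learning from Human Feedback	Addresses the data bottleneck in RLHF with hybrid rewards and a prompt-selection method, improving model performance and diversity.	reinforcement learning PPO RLHF
3	Arch-LLM: Taming LLMs for Neural Architecture Generation via Unsupervised Discrete Representation Learning	Arch-LLM: tames LLMs for neural architecture generation using unsupervised discrete representation learning.	representation learning VQ-VAE large language model
4	Quamba2: A Robust and Scalable Post-training Quantization Framework for Selective State Space Models	Quamba2: a robust and scalable post-training quantization framework for selective state space models.	Mamba SSM state space model	✅
5	RLDBF: Enhancing LLMs Via Reinforcement Learning With DataBase FeedBack	Proposes RLDBF, which uses reinforcement learning with database feedback to improve LLM performance in chemical molecular science.	reinforcement learning large language model
6	Efficient Verified Machine Unlearning For Distillation	Proposes the PURGE framework for faster, efficient verified machine unlearning in knowledge-distillation settings.	teacher-student distillation
7	Probabilistic Uncertain Reward Model	Proposes the Probabilistic Uncertain Reward Model (PURM) to address reward-model overconfidence in RLHF.	reinforcement learning RLHF large language model
8	Generative Latent Neural PDE Solver using Flow Matching	Proposes a generative latent-space neural PDE solver based on flow matching, improving accuracy and long-term stability.	flow matching
9	Reinforcement Learning for Machine Learning Model Deployment: Evaluating Multi-Armed Bandits in ML Ops Environments	Proposes a multi-armed-bandit-based reinforcement learning approach for automating machine learning model deployment and management.	reinforcement learning
10	Policy Optimization and Multi-agent Reinforcement Learning for Mean-variance Team Stochastic Games	Proposes a policy-optimization-based multi-agent reinforcement learning algorithm for mean-variance team stochastic games.	reinforcement learning
11	Entropy-guided sequence weighting for efficient exploration in RL-based LLM fine-tuning	Proposes EGSW, which uses entropy-guided sequence weighting to improve exploration efficiency in RL fine-tuning of LLMs.	reinforcement learning large language model
12	Invariant Control Strategies for Active Flow Control using Graph Neural Networks	Proposes a graph-neural-network-based active flow control strategy that improves generalization and reduces computational cost.	reinforcement learning spatial relationship
13	Fuzzy Cluster-Aware Contrastive Clustering for Time Series	Proposes FCACC, a fuzzy cluster-aware contrastive clustering framework, to improve unsupervised clustering of time series.	representation learning contrastive learning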

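🔬 Pillar 9: Embodied Foundation Models (10 papers)

#	Title	One-sentence summary	Tags	🔗	⭐
14	Assessing Foundation Models for Sea Ice Type Segmentation in Sentinel-1 SAR Imagery	Evaluates foundation models on sea ice type segmentation in Sentinel-1 SAR imagery and analyzes their generalization.	foundation model
15	Breach in the Shield: Unveiling the Vulnerabilities of Large Language Models	Proposes the FI metric, revealing the root causes of the vulnerability of large language models and vision-language models to perturbations.	large language model
16	Generative Reliability-Based Design Optimization Using In-Context Learning Capabilities of Large Language Models	Proposes a generative reliability-based design optimization method built on the in-context learning capabilities of large language models.	large language model
17	Reasoning of Large Language Models over Knowledge Graphs with Super-Relations	ReKnoS: uses super-relations to strengthen LLM reasoning over knowledge graphs.	large language model
18	Landscape of Thoughts: Visualizing the Reasoning Process of Large Language Models	Proposes Landscape of Thoughts, which visualizes the LLM reasoning process to diagnose model behavior.	large language model	✅
19	Multimodal Machine Learning for Real Estate Appraisal: A Comprehensive Survey	A survey of multimodal machine learning applied to real estate appraisal.	multimodal
20	Tokenization of Gaze Data	Proposes five tokenization strategies for gaze data, for LLM-based gaze prediction and generation tasks.	large language model multimodal
21	Niyama : Breaking the Silos of LLM Inference Serving	Niyama: breaks the silos of LLM inference serving, enabling QoS-driven, resource-efficient sharing.	large language model
22	STADE: Standard Deviation as a Pruning Metric	Proposes STADE, a pruning method based on input standard deviation, improving how well LLM pruning generalizes across training conditions.	large language model	✅
23	Few-Shot Graph Out-of-Distribution Detection with LLMs	LLM-GOOD: a few-shot graph OOD detection framework combining LLMs and GNNs, reducing annotation cost.	large language model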

🔬 Pillar 1: Robot Control (1 paper)

#	Title	One-sentence summary	Tags	🔗	⭐
24	Task Tokens: A Flexible Approach to Adapting Behavior Foundation Models	Proposes Task Tokens, a flexible method for adapting behavior foundation models to specific tasks.	humanoid reinforcement learning imitation learning

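🔬 Pillar 8: Physics-based Animation (1 paper)

#	Title	One-sentence summary	Tags	🔗	⭐
25	Time-resolved dynamic CBCT reconstruction using prior-model-free spatiotemporal Gaussian representation (PMF-STGR)	Proposes a dynamic CBCT reconstruction method based on a prior-model-free spatiotemporal Gaussian representation, enabling fast and accurate dynamic CBCT imaging.	spatiotemporal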
