cs.LG (2025-03-06)

📊 32 papers in total | 🔗 4 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (18 🔗3) | Pillar 2: RL Algorithms and Architecture (11 🔗1) | Pillar 8: Physics-based Animation (2) | Pillar 1: Robot Control (1)

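🔬 Pillar 9: Embodied Foundation Models (18 papers)

#	Title	One-line summary	Tags	🔗	⭐
1	TS-RAG: Retrieval-Augmented Generation based Time Series Foundation Models are Stronger Zero-Shot Forecaster	Proposes TS-RAG, a time series foundation model based on retrieval-augmented generation that significantly improves zero-shot forecasting.	large language model foundation model	✅
2	Transferable Foundation Models for Geometric Tasks on Point Cloud Representations: Geometric Neural Operators	Proposes Geometric Neural Operators (GNP) as transferable foundation models for geometric tasks on point clouds.	foundation model
3	Wanda++: Pruning Large Language Models via Regional Gradients	Wanda++: prunes large language models using regional gradients, significantly improving performance.	large language model
4	Incentivizing Multi-Tenant Split Federated Learning for Foundation Models at the Network Edge	Proposes the PRINCE mechanism to incentivize multi-tenant split federated learning for efficient fine-tuning of foundation models at the network edge.	foundation model
5	Predictable Scale: Part I, Step Law -- Optimal Hyperparameter Scaling Law in Large Language Model Pretraining	Proposes Step Law, a general scaling law for hyperparameter optimization in large language model pretraining.	large language model	✅
6	Leveraging Large Language Models to Address Data Scarcity in Machine Learning: Applications in Graphene Synthesis	Uses large language models to address data scarcity for machine learning in graphene synthesis.	large language model
7	Large Language Models for Zero-shot Inference of Causal Structures in Biology	Uses large language models for zero-shot inference of causal structures in biology.	large language model
8	The Challenge of Identifying the Origin of Black-Box Large Language Models	Proposes PlugAE, a technique for proactively tracing the origin of black-box large language models.	large language model
9	RCRank: Multimodal Ranking of Root Causes of Slow Queries in Cloud Database Systems	RCRank: a multimodal method for ranking root causes of slow queries in cloud database systems, improving the efficiency of diagnosis and repair.	multimodal
10	TimeFound: A Foundation Model for Time Series Forecasting	TimeFound: a Transformer-based foundation model for time series forecasting that supports zero-shot forecasting.	foundation model
11	Continual Pre-training of MoEs: How robust is your router?	Studies the robustness of continual pre-training for MoE models and shows how the routing algorithm affects performance.	foundation model
12	Universality of Layer-Level Entropy-Weighted Quantization Beyond Model Architecture and Size	Proposes layer-level entropy-weighted quantization (EWQ) for selective LLM quantization independent of model architecture and size.	large language model
13	CLDyB: Towards Dynamic Benchmarking for Continual Learning with Pre-trained Models	Proposes CLDyB, a dynamic benchmarking framework that addresses data contamination and benchmark saturation in continual learning.	foundation model	✅
14	Know Thy Judge: On the Robustness Meta-Evaluation of LLM Safety Judges	Evaluates the robustness of LLM safety judges, revealing prompt sensitivity and vulnerability to adversarial attacks.	large language model
15	Speculative MoE: Communication Efficient Parallel MoE Inference with Speculative Token and Expert Pre-scheduling	Speculative MoE: improves the communication efficiency of MoE models through speculative token and expert pre-scheduling.	large language model
16	How to Mitigate Overfitting in Weak-to-strong Generalization?	Proposes a two-stage framework to mitigate overfitting in weak-to-strong generalization.	large language model
17	ThrowBench: Benchmarking LLMs by Predicting Runtime Exceptions	Proposes the ThrowBench benchmark for evaluating how well LLMs predict runtime exceptions.	large language model
18	PokéChamp: an Expert-level Minimax Language Agent	PokéChamp: an LLM-based, expert-level minimax agent for Pokémon battles.	large language model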

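🔬 Pillar 2: RL Algorithms and Architecture (11 papers)

#	Title	One-line summary	Tags	🔗	⭐
19	Energy-Weighted Flow Matching for Offline Reinforcement Learning	Proposes energy-weighted flow matching for offline reinforcement learning.	reinforcement learning offline RL offline reinforcement learning
20	scDD: Latent Codes Based scRNA-seq Dataset Distillation with Foundation Model Knowledge	scDD: latent-code-based scRNA-seq dataset distillation using foundation model knowledge.	distillation foundation model
21	MTS: A Deep Reinforcement Learning Portfolio Management Framework with Time-Awareness and Short-Selling	MTS: a deep reinforcement learning portfolio management framework that combines time-awareness and short-selling.	reinforcement learning deep reinforcement learning
22	Learning Transformer-based World Models with Contrastive Predictive Coding	Proposes TWISTER, which learns Transformer-based world models with contrastive predictive coding to improve reinforcement learning performance.	reinforcement learning world model dreamer
23	Provably Correct Automata Embeddings for Optimal Automata-Conditioned Reinforcement Learning	Proposes provably correct automata embeddings for optimal automata-conditioned reinforcement learning.	reinforcement learning policy learning
24	Knowledge Retention for Continual Model-Based Reinforcement Learning	DRAGO: a knowledge-retention method for continual model-based reinforcement learning.	reinforcement learning world model
25	Can We Optimize Deep RL Policy Weights as Trajectory Modeling?	Proposes the TIPL model, which uses a Transformer to model trajectories of deep RL policy weights in order to optimize policy learning.	reinforcement learning deep reinforcement learning DRL
26	Accurate predictive model of band gap with selected important features based on explainable machine learning	Proposes a band gap prediction model based on explainable machine learning, improving generalization and reducing computational cost.	predictive model
27	DAST: Difficulty-Adaptive Slow-Thinking for Large Reasoning Models	Proposes the Difficulty-Adaptive Slow-Thinking (DAST) framework to address overthinking in large reasoning models.	reward shaping chain-of-thought	✅
28	Frequency Hopping Synchronization by Reinforcement Learning for Satellite Communication System	Proposes a reinforcement-learning-based frequency hopping synchronization method that improves the anti-jamming capability of satellite communication systems.	reinforcement learning
29	Quantum-Inspired Reinforcement Learning in the Presence of Epistemic Ambivalence	Proposes the EA-MDP framework and the EA-epsilon-greedy Q-learning algorithm for reinforcement learning under epistemic ambivalence.	reinforcement learning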

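🔬 Pillar 8: Physics-based Animation (2 papers)

#	Title	One-line summary	Tags	🔗	⭐
30	Federated Dynamic Modeling and Learning for Spatiotemporal Data Forecasting	Proposes a federated dynamic modeling and learning framework for spatiotemporal data forecasting, improving accuracy and privacy protection.	spatiotemporal multimodal
31	Topology-Aware Conformal Prediction for Stream Networks	Proposes STACI for topology-aware conformal prediction on stream networks.	spatiotemporal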

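🔬 Pillar 1: Robot Control (1 paper)

#	Title	One-line summary	Tags	🔗	⭐
32	Poisoning Attacks to Local Differential Privacy Protocols for Trajectory Data	Proposes the TraP algorithm, an efficient poisoning attack on local differential privacy protocols for trajectory data.	manipulation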
