cs.LG (2025-03-06)

📊 32 papers in total | 🔗 4 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (18 🔗3) · Pillar 2: RL & Architecture (11 🔗1) · Pillar 8: Physics-based Animation (2) · Pillar 1: Robot Control (1)

🔬 Pillar 9: Embodied Foundation Models (18 papers)

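| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 1 | TS-RAG: Retrieval-Augmented Generation based Time Series Foundation Models are Stronger Zero-Shot Forecaster | Proposes TS-RAG, a time series foundation model based on retrieval-augmented generation that significantly improves zero-shot forecasting. | large language model, foundation model | |
| 2 | Transferable Foundation Models for Geometric Tasks on Point Cloud Representations: Geometric Neural Operators | Proposes Geometric Neural Operators (GNP) as transferable foundation models for geometric tasks on point clouds. | foundation model | |
| 3 | Wanda++: Pruning Large Language Models via Regional Gradients | Wanda++: prunes large language models using regional gradients, significantly improving performance. | large language model | |
| 4 | Incentivizing Multi-Tenant Split Federated Learning for Foundation Models at the Network Edge | Proposes the PRINCE mechanism, which incentivizes multi-tenant split federated learning to fine-tune foundation models efficiently at the network edge. | foundation model | |
| 5 | Predictable Scale: Part I, Step Law -- Optimal Hyperparameter Scaling Law in Large Language Model Pretraining | Proposes Step Law, a universal scaling law for hyperparameter optimization in large language model pretraining. | large language model | |
| 6 | Leveraging Large Language Models to Address Data Scarcity in Machine Learning: Applications in Graphene Synthesis | Uses large language models to address data scarcity for machine learning in graphene synthesis. | large language model | |
| 7 | Large Language Models for Zero-shot Inference of Causal Structures in Biology | Uses large language models for zero-shot inference of causal structures in biology. | large language model | |
| 8 | The Challenge of Identifying the Origin of Black-Box Large Language Models | Proposes PlugAE, a technique for proactively tracing the origin of black-box large language models. | large language model | |
| 9 | RCRank: Multimodal Ranking of Root Causes of Slow Queries in Cloud Database Systems | RCRank: proposes a multimodal method for ranking the root causes of slow queries in cloud database systems, making diagnosis and repair more efficient. | multimodal | |
| 10 | TimeFound: A Foundation Model for Time Series Forecasting | TimeFound: a Transformer-based foundation model for time series forecasting that supports zero-shot forecasting. | foundation model | |
| 11 | Continual Pre-training of MoEs: How robust is your router? | Studies the robustness of continual pre-training for MoE models and shows how the routing algorithm affects performance. | foundation model | |
| 12 | Universality of Layer-Level Entropy-Weighted Quantization Beyond Model Architecture and Size | Proposes layer-level entropy-weighted quantization (EWQ), enabling selective quantization of LLMs independent of model architecture and size. | large language model | |
| 13 | CLDyB: Towards Dynamic Benchmarking for Continual Learning with Pre-trained Models | Proposes CLDyB, a dynamic benchmarking framework that addresses data contamination and benchmark saturation in continual learning. | foundation model | |
| 14 | Know Thy Judge: On the Robustness Meta-Evaluation of LLM Safety Judges | Evaluates the robustness of LLM safety judges, revealing prompt sensitivity and vulnerability to adversarial attacks. | large language model | |
| 15 | Speculative MoE: Communication Efficient Parallel MoE Inference with Speculative Token and Expert Pre-scheduling | Speculative MoE: improves the communication efficiency of MoE models through speculative token and expert pre-scheduling. | large language model | |
| 16 | How to Mitigate Overfitting in Weak-to-strong Generalization? | Proposes a two-stage framework to mitigate overfitting in weak-to-strong generalization. | large language model | |
| 17 | ThrowBench: Benchmarking LLMs by Predicting Runtime Exceptions | Proposes the ThrowBench benchmark for evaluating how well LLMs predict runtime exceptions. | large language model | |
| 18 | PokéChamp: an Expert-level Minimax Language Agent | PokéChamp: an LLM-based, expert-level minimax agent for Pokémon battles. | large language model | |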

🔬 Pillar 2: RL & Architecture (11 papers)

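| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 19 | Energy-Weighted Flow Matching for Offline Reinforcement Learning | Proposes an energy-weighted flow matching method for offline reinforcement learning. | reinforcement learning, offline RL, offline reinforcement learning | |
| 20 | scDD: Latent Codes Based scRNA-seq Dataset Distillation with Foundation Model Knowledge | scDD: latent-code-based distillation of scRNA-seq datasets using foundation model knowledge. | distillation, foundation model | |
| 21 | MTS: A Deep Reinforcement Learning Portfolio Management Framework with Time-Awareness and Short-Selling | MTS: a deep reinforcement learning framework for portfolio management that combines time-awareness with short-selling. | reinforcement learning, deep reinforcement learning | |
| 22 | Learning Transformer-based World Models with Contrastive Predictive Coding | Proposes TWISTER, which learns Transformer world models with contrastive predictive coding to improve reinforcement learning performance. | reinforcement learning, world model, dreamer | |
| 23 | Provably Correct Automata Embeddings for Optimal Automata-Conditioned Reinforcement Learning | Proposes provably correct automata embeddings for optimal automata-conditioned reinforcement learning. | reinforcement learning, policy learning | |
| 24 | Knowledge Retention for Continual Model-Based Reinforcement Learning | DRAGO: a knowledge retention method for continual model-based reinforcement learning. | reinforcement learning, world model | |
| 25 | Can We Optimize Deep RL Policy Weights as Trajectory Modeling? | Proposes the TIPL model, which uses a Transformer to model the trajectory of deep RL policy weights in order to optimize policy learning. | reinforcement learning, deep reinforcement learning, DRL | |
| 26 | Accurate predictive model of band gap with selected important features based on explainable machine learning | Proposes a band gap prediction model based on explainable machine learning that improves generalization and lowers computational cost. | predictive model | |
| 27 | DAST: Difficulty-Adaptive Slow-Thinking for Large Reasoning Models | Proposes the Difficulty-Adaptive Slow-Thinking (DAST) framework to address overthinking in large reasoning models. | reward shaping, chain-of-thought | |
| 28 | Frequency Hopping Synchronization by Reinforcement Learning for Satellite Communication System | Proposes a reinforcement-learning-based frequency hopping synchronization method that improves the anti-jamming capability of satellite communication systems. | reinforcement learning | |
| 29 | Quantum-Inspired Reinforcement Learning in the Presence of Epistemic Ambivalence | Proposes the EA-MDP framework and the EA-epsilon-greedy Q-learning algorithm for reinforcement learning under epistemic ambivalence. | reinforcement learning | |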

🔬 Pillar 8: Physics-based Animation (2 papers)

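| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 30 | Federated Dynamic Modeling and Learning for Spatiotemporal Data Forecasting | Proposes a federated dynamic modeling and learning framework for spatiotemporal data forecasting that improves accuracy and privacy protection. | spatiotemporal, multimodal | |
| 31 | Topology-Aware Conformal Prediction for Stream Networks | Proposes STACI for topology-aware conformal prediction on stream networks. | spatiotemporal | |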

🔬 Pillar 1: Robot Control (1 paper)

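| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 32 | Poisoning Attacks to Local Differential Privacy Protocols for Trajectory Data | Proposes the TraP algorithm, which mounts efficient poisoning attacks on local differential privacy protocols for trajectory data. | manipulation | |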
