cs.LG(2025-01-14)

📊 10 papers in total | 🔗 1 with code

🎯 Interest Area Navigation

Pillar 2: RL Algorithms & Architecture (4 🔗1) · Pillar 9: Embodied Foundation Models (4) · Pillar 7: Motion Retargeting (1) · Pillar 8: Physics-based Animation (1)

🔬 Pillar 2: RL Algorithms & Architecture (4 papers)

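| # | Title | One-line summary | Tags | 🔗 |
| 1 | CuAsmRL: Optimizing GPU SASS Schedules via Deep Reinforcement Learning | CuAsmRL: optimizes GPU SASS instruction scheduling with deep reinforcement learning. | reinforcement learning, deep reinforcement learning, large language model | |
| 2 | Iterative Label Refinement Matters More than Preference Optimization under Weak Supervision | Under weak supervision, iterative label refinement outperforms preference optimization and improves performance on complex tasks. | reinforcement learning, RLHF, DPO | |
| 3 | Dynamic Pricing in High-Speed Railways Using Multi-Agent Reinforcement Learning | Proposes a multi-agent reinforcement learning framework for dynamic pricing in high-speed railways to optimize operator revenue. | reinforcement learning, deep reinforcement learning | |
| 4 | Reward Compatibility: A Framework for Inverse RL | Proposes an inverse reinforcement learning framework based on reward compatibility, improving algorithm efficiency in complex MDPs. | reinforcement learning, inverse reinforcement learning | |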

🔬 Pillar 9: Embodied Foundation Models (4 papers)

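| # | Title | One-line summary | Tags | 🔗 |
| 5 | Uncovering Bias in Foundation Models: Impact, Testing, Harm, and Mitigation | Proposes TriProTesting and AdaLogAdjustment to detect and mitigate bias in foundation models. | foundation model | |
| 6 | Text-Diffusion Red-Teaming of Large Language Models: Unveiling Harmful Behaviors with Proximity Constraints | Proposes DART, a diffusion-based red-teaming method for LLMs that uncovers harmful behaviors under proximity constraints. | large language model | |
| 7 | DNN-Powered MLOps Pipeline Optimization for Large Language Models: A Framework for Automated Deployment and Resource Management | Proposes a DNN-based MLOps optimization framework that automates deployment and resource management for large language models. | large language model | |
| 8 | Gandalf the Red: Adaptive Security for LLMs | Proposes the Gandalf platform and the D-SEC model for evaluating and improving the adaptive security of LLMs against prompt attacks. | large language model | |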

🔬 Pillar 7: Motion Retargeting (1 paper)

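| # | Title | One-line summary | Tags | 🔗 |
| 9 | BiDepth: A Bidirectional-Depth Neural Network for Spatio-Temporal Prediction | Proposes the BiDepth model, which improves spatio-temporal prediction accuracy through bidirectional depth modulation and convolutional self-attention. | spatial relationship, multimodal | |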

🔬 Pillar 8: Physics-based Animation (1 paper)

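| # | Title | One-line summary | Tags | 🔗 |
| 10 | On the use of Statistical Learning Theory for model selection in Structural Health Monitoring | Uses statistical learning theory for model selection in structural health monitoring. | PULSE | |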
