cs.LG (2024-12-18)

📊 15 papers total | 🔗 3 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (7, 🔗 3) · Pillar 2: RL & Architecture (7) · Pillar 5: Interaction & Reaction (1)

🔬 Pillar 9: Embodied Foundation Models (7 papers)

| # | Title | One-line takeaway | Tags | 🔗 |
| --- | --- | --- | --- | --- |
| 1 | Adaptive Concept Bottleneck for Foundation Models Under Distribution Shifts | Proposes an adaptive concept bottleneck model that improves the interpretability and accuracy of foundation models under distribution shift (see the sketch after this table) | foundation model | |
| 2 | Data-Efficient Inference of Neural Fluid Fields via SciML Foundation Model | Uses a SciML foundation model to improve the data efficiency and generalization of neural fluid fields | foundation model | |
| 3 | ResQ: Mixed-Precision Quantization of Large Language Models with Low-Rank Residuals | Proposes ResQ to tackle the high quantization error that arises when quantizing large language models | large language model | |
| 4 | Future Research Avenues for Artificial Intelligence in Digital Gaming: An Exploratory Report | Explores future research directions for AI in digital gaming, focusing on deep learning applications | large language model | |
| 5 | Few-shot Steerable Alignment: Adapting Rewards and LLM Policies with Neural Processes | Proposes a few-shot steerable alignment framework based on neural processes for personalized LLM preference alignment | large language model | |
| 6 | Information-Theoretic Generative Clustering of Documents | Proposes an information-theoretic generative clustering method that uses LLMs to improve document clustering | large language model | |
| 7 | GMoE: Empowering LLMs Fine-Tuning via MoE Graph Collaboration | Proposes the GMoE framework, improving the stability and efficiency of LLM fine-tuning via MoE graph collaboration | large language model | |
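Entry 1's one-liner is easy to unpack in code: a concept bottleneck first predicts a small set of human-readable concepts and lets the label head see only those concepts, so distribution shift can be diagnosed and corrected at the concept layer. Below is a minimal PyTorch sketch of the classic (non-adaptive) bottleneck design; the class name, dimensions, and joint loss are illustrative assumptions, not the paper's adaptive method.

```python
import torch
import torch.nn as nn

class ConceptBottleneckHead(nn.Module):
    """Generic concept bottleneck: features -> concepts -> label.

    A sketch of the classic design; the adaptive, shift-aware variant
    from entry 1 is not reproduced here.
    """
    def __init__(self, feat_dim: int, n_concepts: int, n_classes: int):
        super().__init__()
        self.concept_head = nn.Linear(feat_dim, n_concepts)  # concept logits
        self.label_head = nn.Linear(n_concepts, n_classes)   # sees only concepts

    def forward(self, feats: torch.Tensor):
        concept_logits = self.concept_head(feats)
        concepts = torch.sigmoid(concept_logits)  # interpretable activations
        return self.label_head(concepts), concept_logits

# Joint training on hypothetical tensors: supervise both concepts and labels.
head = ConceptBottleneckHead(feat_dim=768, n_concepts=32, n_classes=10)
feats = torch.randn(4, 768)  # e.g. frozen foundation-model features
concept_targets = torch.randint(0, 2, (4, 32)).float()
labels = torch.randint(0, 10, (4,))
logits, c_logits = head(feats)
loss = nn.functional.cross_entropy(logits, labels) \
     + nn.functional.binary_cross_entropy_with_logits(c_logits, concept_targets)
loss.backward()
```

Because every prediction is forced through the concept activations, intervening on or recalibrating `concepts` under a shift changes the final prediction in an interpretable way.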

🔬 Pillar 2: RL & Architecture (7 papers)

| # | Title | One-line takeaway | Tags | 🔗 |
| --- | --- | --- | --- | --- |
| 8 | Reinforcement Learning from Automatic Feedback for High-Quality Unit Test Generation | Proposes RLSQM, using reinforcement learning and static quality metrics to automatically generate high-quality unit tests | reinforcement learning, large language model | |
| 9 | Stealing That Free Lunch: Exposing the Limits of Dyna-Style Reinforcement Learning | Exposes the performance gaps and limitations of Dyna-style reinforcement learning across environments | reinforcement learning, model-based RL | |
| 10 | Enabling Realtime Reinforcement Learning at Scale with Staggered Asynchronous Inference | Proposes staggered asynchronous inference to address large-model inference latency in real-time reinforcement learning | reinforcement learning | |
| 11 | Distributionally Robust Policy Learning under Concept Drifts | Proposes distributionally robust policy learning under concept drift, improving policy generalization in changing environments | policy learning | |
| 12 | Heterogeneous Multi-Agent Reinforcement Learning for Distributed Channel Access in WLANs | Proposes the QPMIX heterogeneous multi-agent reinforcement learning framework for distributed channel access in WLANs | reinforcement learning | |
| 13 | Harvesting energy from turbulent winds with Reinforcement Learning | Proposes reinforcement-learning-based control for airborne wind energy systems, targeting energy harvesting in turbulent conditions | reinforcement learning | |
| 14 | Energy-Based Preference Model Offers Better Offline Alignment than the Bradley-Terry Preference Model | Proposes an energy-based preference model that resolves the Bradley-Terry model's non-unique solutions in offline alignment, improving LLM alignment (see the sketch after this table) | RLHF, DPO | |
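For context on entry 14: the Bradley-Terry model scores a preferred response y⁺ over y⁻ as P(y⁺ ≻ y⁻) = σ(r(y⁺) − r(y⁻)), so adding any constant to the reward leaves the likelihood unchanged; that degeneracy is the multi-solution problem the energy-based model targets. A minimal PyTorch sketch of the standard Bradley-Terry reward-model loss (the paper's energy-based alternative is not reproduced here, and the reward values are made up):

```python
import torch
import torch.nn.functional as F

def bradley_terry_loss(r_chosen: torch.Tensor, r_rejected: torch.Tensor) -> torch.Tensor:
    """Negative log-likelihood of P(chosen > rejected) = sigmoid(r_c - r_r).

    Note: shifting both rewards by the same constant leaves this loss
    unchanged -- the non-uniqueness that entry 14 argues an energy-based
    preference model avoids.
    """
    return -F.logsigmoid(r_chosen - r_rejected).mean()

# Hypothetical scalar rewards from a reward model for a batch of pairs.
r_chosen = torch.tensor([1.3, 0.2, 2.1], requires_grad=True)
r_rejected = torch.tensor([0.7, 0.5, 1.0])
loss = bradley_terry_loss(r_chosen, r_rejected)
loss.backward()
print(loss.item())  # identical if 5.0 is added to both reward tensors
```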

🔬 Pillar 5: Interaction & Reaction (1 paper)

| # | Title | One-line takeaway | Tags | 🔗 |
| --- | --- | --- | --- | --- |
| 15 | Nemesis: Noise-randomized Encryption with Modular Efficiency and Secure Integration in Machine Learning Systems | Nemesis: accelerates FHE-based machine learning systems via noise-randomized encryption and modular efficiency (see the toy example after this table) | OMOMO | |
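The core mechanic behind "noise-randomized encryption" in entry 15 is that adding a fresh encryption of zero re-randomizes a ciphertext's noise without changing its plaintext. The keyless integer toy below illustrates only that mechanic; the modulus `Q`, scale `DELTA`, and noise bound are made-up parameters, and this is not Nemesis's scheme or any real FHE library.

```python
import secrets

Q = 2**16      # toy ciphertext modulus
DELTA = 2**8   # toy scaling factor for the message

def noise(bound: int = 4) -> int:
    """Toy stand-in for an 'encryption of zero': small random noise."""
    return secrets.randbelow(2 * bound + 1) - bound

def encrypt(m: int) -> int:
    return (m * DELTA + noise()) % Q

def randomize(c: int) -> int:
    """Re-randomize: add fresh zero-encryption noise; plaintext unchanged."""
    return (c + noise()) % Q

def decrypt(c: int) -> int:
    return ((c + DELTA // 2) % Q) // DELTA  # round to nearest multiple of DELTA

c = encrypt(7)
c2 = randomize(c)
assert decrypt(c) == decrypt(c2) == 7  # same message, different ciphertext
```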
