cs.LG (2024-05-24)
📊 6 papers in total | 🔗 1 paper with code
🎯 Interest Area Navigation
🔬 Pillar 9: Embodied Foundation Models (4 papers)
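| # | Title | One-sentence summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 1 | Basis Selection: Low-Rank Decomposition of Pretrained Large Language Models for Target Applications | Proposes a basis-selection-based low-rank decomposition method that compresses LLMs for specific target applications. | large language model | | |
| 2 | Athena: Efficient Block-Wise Post-Training Quantization for Large Language Models Using Second-Order Matrix Derivative Information | Athena: efficiently quantizes large language models using second-order matrix derivative information. | large language model | | |
| 3 | Transformers represent belief state geometry in their residual stream | Transformers linearly represent belief-state geometry in their residual stream, which also carries information about the future. | large language model | | |
| 4 | Pipeline Parallelism with Controllable Memory | Proposes a pipeline-parallelism framework with controllable memory that significantly improves large-model training throughput. | large language model | ✅ | |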
🔬 Pillar 2: RL Algorithms & Architecture (2 papers)
| # | Title | One-sentence summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 5 | Understanding the differences in Foundation Models: Attention, State Space Models, and Recurrent Neural Networks | Proposes the Dynamical Systems Framework (DSF) to analyze attention, SSMs, and RNNs in a unified way, revealing design principles for efficient foundation models. | SSM, state space model, linear attention | | |
| 6 | Intelligent Go-Explore: Standing on the Shoulders of Giant Foundation Models | Intelligent Go-Explore: uses large pretrained models to solve hard exploration problems. | reinforcement learning, foundation model | | |