cs.LG (2024-05-24)

📊 6 papers in total | 🔗 1 with code

🎯 Papers by Interest Area

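#	Title	One-Sentence Summary	Tags	🔗
1	Basis Selection: Low-Rank Decomposition of Pretrained Large Language Models for Target Applications	Proposes a low-rank decomposition method based on basis selection that compresses LLMs to fit specific target applications.	large language model
2	Athena: Efficient Block-Wise Post-Training Quantization for Large Language Models Using Second-Order Matrix Derivative Information	Athena: efficiently quantizes large language models using second-order matrix derivative information.	large language model
3	Transformers represent belief state geometry in their residual stream	Transformers linearly represent belief-state geometry in their residual stream, which carries information about the future.	large language model
4	Pipeline Parallelism with Controllable Memory	Proposes a pipeline-parallelism framework with controllable memory that significantly improves large-model training throughput.	large language model	✅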

#	Title	One-Sentence Summary	Tags	🔗	⭐
5	Understanding the differences in Foundation Models: Attention, State Space Models, and Recurrent Neural Networks	Proposes the Dynamical Systems Framework (DSF) to analyze attention, SSMs, and RNNs in a unified way, revealing design principles for efficient foundation models.	SSM state space model linear attention
6	Intelligent Go-Explore: Standing on the Shoulders of Giant Foundation Models	Intelligent Go-Explore: uses large pretrained models to solve hard exploration problems.	reinforcement learning foundation model