cs.LG (2024-11-12)

📊 20 papers in total

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (10) · Pillar 2: RL Algorithms & Architecture (6) · Pillar 1: Robot Control (2) · Pillar 5: Interaction & Reaction (1) · Pillar 8: Physics-based Animation (1)

🔬 Pillar 9: Embodied Foundation Models (10 papers)

| # | Title | One-line takeaway | Tags |
| --- | --- | --- | --- |
| 1 | ASER: Activation Smoothing and Error Reconstruction for Large Language Model Quantization | Achieves low-bit quantization of large language models via activation smoothing and error reconstruction. | large language model |
| 2 | Zer0-Jack: A Memory-efficient Gradient-based Jailbreaking Method for Black-box Multi-modal Large Language Models | A memory-efficient, gradient-based jailbreaking method for black-box multi-modal LLMs. | large language model |
| 3 | Retrieval Augmented Time Series Forecasting | Proposes RAF, a retrieval-augmented forecasting framework that improves the zero-shot accuracy of time-series foundation models across diverse scenarios. | foundation model |
| 4 | NVCiM-PT: An NVCiM-assisted Prompt Tuning Framework for Edge LLMs | Proposes an NVCiM-assisted prompt-tuning framework to address domain shift for edge LLMs. | large language model |
| 5 | FRUGAL: Memory-Efficient Optimization by Reducing State Overhead for Scalable Training | Enables memory-efficient optimization for scalable training by reducing optimizer-state overhead. | large language model |
| 6 | Efficient Federated Finetuning of Tiny Transformers with Resource-Constrained Devices | Proposes an efficient layer-wise federated fine-tuning scheme for tiny Transformers, addressing memory and compute bottlenecks on resource-constrained devices. | large language model |
| 7 | Federated Low-Rank Adaptation with Differential Privacy over Wireless Networks | Proposes a differentially private federated low-rank adaptation framework over wireless networks, addressing the compute and privacy challenges of fine-tuning large models on edge devices. | foundation model |
| 8 | What Do Learning Dynamics Reveal About Generalization in LLM Reasoning? | Uses learning dynamics to explain generalization in LLM reasoning; proposes a pre-memorization training-accuracy metric. | large language model |
| 9 | Circuit Complexity Bounds for RoPE-based Transformer Architecture | Proves fundamental expressivity limits of RoPE-based Transformers within specific circuit-complexity classes. | large language model |
| 10 | Model Stealing for Any Low-Rank Language Model | A model-stealing algorithm for low-rank language models with improved efficiency and broader applicability. | large language model |
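Several entries above (papers 4–7) concern parameter-efficient fine-tuning on constrained devices, and paper 7 in particular builds on low-rank adaptation (LoRA). A minimal NumPy sketch of the LoRA reparameterization these methods share; the function name and shapes are illustrative, not taken from any listed paper:

```python
import numpy as np

def lora_forward(x, W, A, B, alpha=16.0):
    # Frozen base weight W (d_in, d_out); trainable low-rank factors
    # A (d_in, r) and B (r, d_out). Only r * (d_in + d_out) parameters
    # are updated instead of d_in * d_out.
    r = A.shape[1]
    return x @ W + (alpha / r) * (x @ A) @ B

rng = np.random.default_rng(0)
d_in, d_out, r = 64, 64, 4
W = rng.normal(size=(d_in, d_out))
A = rng.normal(size=(d_in, r)) * 0.01
B = np.zeros((r, d_out))  # B starts at zero, so the adapter is initially a no-op
x = rng.normal(size=(1, d_in))
assert np.allclose(lora_forward(x, W, A, B), x @ W)
```

Because only `A` and `B` are trained, schemes like paper 7's can communicate or privatize just the low-rank factors rather than the full weight matrix.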

🔬 Pillar 2: RL Algorithms & Architecture (6 papers)

| # | Title | One-line takeaway | Tags |
| --- | --- | --- | --- |
| 11 | LLMPhy: Complex Physical Reasoning Using Large Language Models and World Models | Performs complex physical reasoning by combining large language models with world models. | world model, large language model |
| 12 | Entropy Controllable Direct Preference Optimization | Proposes H-DPO, which improves DPO via entropy control, boosting LLM performance on mathematical tasks. | reinforcement learning, RLHF, DPO |
| 13 | Robust Offline Reinforcement Learning for Non-Markovian Decision Processes | A robust offline RL algorithm addressing uncertainty in non-Markovian decision processes. | reinforcement learning, offline reinforcement learning, distillation |
| 14 | Test Where Decisions Matter: Importance-driven Testing for Deep Reinforcement Learning | Importance-driven testing for deep RL, focusing on safety-critical decisions. | reinforcement learning, deep reinforcement learning |
| 15 | Navigation with QPHIL: Quantizing Planner for Hierarchical Implicit Q-Learning | Proposes QPHIL, a quantized planner for hierarchical implicit Q-learning, improving performance on complex navigation tasks. | reinforcement learning, offline RL, offline reinforcement learning |
| 16 | Overcoming the Curse of Dimensionality in Reinforcement Learning Through Approximate Factorization | Overcomes the curse of dimensionality in RL via approximate factorization, improving sample efficiency. | reinforcement learning |
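Paper 12 extends Direct Preference Optimization (DPO) with entropy control; its exact H-DPO objective is in the paper. The sketch below shows only the standard DPO loss it builds on, with hypothetical scalar log-probabilities for a chosen/rejected completion pair:

```python
import math

def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    # Standard DPO: -log sigmoid(beta * (margin_chosen - margin_rejected)),
    # where each margin is the policy-vs-reference log-prob gap for the
    # chosen (w) and rejected (l) completions.
    margin = (logp_w - ref_logp_w) - (logp_l - ref_logp_l)
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))

# If the policy matches the reference exactly, the loss is log 2.
assert abs(dpo_loss(-5.0, -7.0, -5.0, -7.0) - math.log(2.0)) < 1e-9
# Raising the chosen completion's policy log-prob lowers the loss.
assert dpo_loss(-4.0, -7.0, -5.0, -7.0) < math.log(2.0)
```

H-DPO's contribution, per the summary, is making the entropy of the resulting policy controllable rather than fixed by this objective.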

🔬 Pillar 1: Robot Control (2 papers)

| # | Title | One-line takeaway | Tags |
| --- | --- | --- | --- |
| 17 | Doubly Mild Generalization for Offline Reinforcement Learning | Proposes Doubly Mild Generalization (DMG) to improve offline RL performance. | locomotion, reinforcement learning, offline RL |
| 18 | Top-$nσ$: Not All Logits Are You Need | Proposes Top-$nσ$ sampling, using a statistical threshold to improve accuracy and diversity on LLM reasoning tasks. | manipulation, large language model |
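Paper 18's title and summary suggest a statistical logit-threshold sampler. A minimal sketch under the assumption that tokens are kept only when their logit lies within $n$ standard deviations of the maximum; the function name and exact threshold rule are my reading of the summary, not necessarily the paper's algorithm:

```python
import numpy as np

def top_nsigma_filter(logits, n=1.0):
    # Keep only tokens whose logit lies within n standard deviations
    # of the maximum logit; mask the rest to -inf before the softmax.
    threshold = logits.max() - n * logits.std()
    masked = np.where(logits >= threshold, logits, -np.inf)
    exp = np.exp(masked - masked.max())
    return exp / exp.sum()

logits = np.array([8.0, 7.9, 2.0, 1.0, 0.5])
probs = top_nsigma_filter(logits, n=1.0)
assert probs[2] == 0.0 and probs[3] == 0.0  # low-logit tokens are masked out
assert abs(probs.sum() - 1.0) < 1e-9
```

Unlike top-k or top-p, the cutoff here adapts to the spread of the logit distribution itself, which is presumably what the "statistical threshold" in the summary refers to.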

🔬 Pillar 5: Interaction & Reaction (1 paper)

| # | Title | One-line takeaway | Tags |
| --- | --- | --- | --- |
| 19 | Privacy-Preserving Verifiable Neural Network Inference Service | Proposes vPIN, a privacy-preserving and verifiable neural-network inference service. | OMOMO |

🔬 Pillar 8: Physics-based Animation (1 paper)

| # | Title | One-line takeaway | Tags |
| --- | --- | --- | --- |
| 20 | Spatially Regularized Graph Attention Autoencoder Framework for Detecting Rainfall Extremes | Proposes a spatially regularized graph-attention autoencoder for detecting extreme rainfall events over India. | spatiotemporal |
