cs.LG(2025-01-25)

📊 18 papers in total | 🔗 2 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (9 🔗1) · Pillar 2: RL & Architecture (8 🔗1) · Pillar 1: Robot Control (1)

🔬 Pillar 9: Embodied Foundation Models (9 papers)

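# | Title | Key point | Tags | 🔗
1 | Technology Mapping with Large Language Models | Proposes the STARS framework, which uses large language models to build accurate maps of companies' technology stacks. | large language model, chain-of-thought
2 | Mirage in the Eyes: Hallucination Attack on Multi-modal Large Language Models with Only Attention Sink | Launches hallucination attacks on multi-modal large language models by exploiting the attention sink mechanism. | large language model
3 | ToMoE: Converting Dense Large Language Models to Mixture-of-Experts through Dynamic Structural Pruning | ToMoE: converts dense large language models into Mixture-of-Experts models through dynamic structural pruning. | large language model
4 | PIP: Perturbation-based Iterative Pruning for Large Language Models | PIP: a perturbation-based iterative pruning method for optimizing large language models. | large language model
5 | FBQuant: FeedBack Quantization for Large Language Models | Proposes FBQuant, a feedback quantization method for optimizing large language models that improves accuracy in on-device deployment. | large language model
6 | Fairness in LLM-Generated Surveys | Proposes a framework for evaluating the fairness of LLM-generated surveys, revealing sociodemographic biases and improving model fairness. | large language model
7 | Lightweight and Post-Training Structured Pruning for On-Device Large Language Models | Proposes COMP, a lightweight post-training structured pruning method for LLMs, suited to on-device deployment. | large language model
8 | Each Rank Could be an Expert: Single-Ranked Mixture of Experts LoRA for Multi-Task Learning | Proposes SMoRA, a single-rank Mixture-of-Experts LoRA that addresses task conflicts in multi-task learning. | large language model
9 | RotateKV: Accurate and Robust 2-Bit KV Cache Quantization for LLMs via Outlier-Aware Adaptive Rotations | RotateKV: accurate and robust 2-bit KV cache quantization for LLMs through adaptive rotations. | large language model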

🔬 Pillar 2: RL & Architecture (8 papers)

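# | Title | Key point | Tags | 🔗
10 | Extensive Exploration in Complex Traffic Scenarios using Hierarchical Reinforcement Learning | Proposes a hierarchical reinforcement learning approach to autonomous driving in complex traffic scenarios. | reinforcement learning, deep reinforcement learning, DRL
11 | Reinforcement Learning Controlled Adaptive PSO for Task Offloading in IIoT Edge Computing | Proposes a reinforcement-learning-controlled adaptive PSO algorithm for task offloading in IIoT edge computing. | reinforcement learning, SAC, predictive model
12 | Clear Preferences Leave Traces: Reference Model-Guided Sampling for Preference Learning | Proposes a reference model-guided sampling strategy that improves the quality and efficiency of preference learning data. | preference learning, DPO, direct preference optimization
13 | Inductive Biases for Zero-shot Systematic Generalization in Language-informed Reinforcement Learning | Proposes a language-informed reinforcement learning model based on neural production systems and memory augmentation, improving zero-shot systematic generalization. | reinforcement learning
14 | Predictive Modeling and Uncertainty Quantification of Fatigue Life in Metal Alloys using Machine Learning | Combines physical models with machine learning to improve the accuracy and uncertainty quantification of metal fatigue life prediction. | predictive model
15 | On Accelerating Edge AI: Optimizing Resource-Constrained Environments | Explores acceleration and optimization strategies for deep learning models in resource-constrained edge AI. | distillation, large language model
16 | Reliable Pseudo-labeling via Optimal Transport with Attention for Short Text Clustering | Proposes the POTA framework, which uses optimal transport and attention to generate reliable pseudo-labels and improve short text clustering. | representation learning, contrastive learning
17 | Divergence-Augmented Policy Optimization | Proposes DAPO, which augments policy optimization with a divergence term to improve reinforcement learning performance when reusing offline data. | reinforcement learning, deep reinforcement learning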

🔬 Pillar 1: Robot Control (1 paper)

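# | Title | Key point | Tags | 🔗
18 | Predictive Lagrangian Optimization for Constrained Reinforcement Learning | Proposes a predictive Lagrangian optimization algorithm that uses model predictive control to improve constrained reinforcement learning performance. | MPC, model predictive control, reinforcement learning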
