cs.LG (2024-11-18)

📊 27 papers in total | 🔗 2 with code

🎯 Interest Area Navigation

Pillar 2: RL Algorithms & Architecture (14 🔗2) · Pillar 9: Embodied Foundation Models (10) · Pillar 1: Robot Control (1) · Pillar 7: Motion Retargeting (1) · Pillar 8: Physics-based Animation (1)

🔬 Pillar 2: RL Algorithms & Architecture (14 papers)

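| # | Title | One-line summary | Tags | 🔗 |
|---|-------|------------------|------|----|
| 1 | MMBind: Unleashing the Potential of Distributed and Heterogeneous Data for Multimodal Learning in IoT | MMBind: multimodal learning for IoT using distributed, heterogeneous data. | contrastive learning, foundation model, multimodal | |
| 2 | Preserving Expert-Level Privacy in Offline Reinforcement Learning | Proposes a consensus-based, expert-level differentially private offline RL method that protects expert privacy. | reinforcement learning, offline RL, offline reinforcement learning | |
| 3 | METEOR: Evolutionary Journey of Large Language Models from Guidance to Self-Growth | Proposes METEOR, which takes large language models from guided learning to self-evolution. | distillation, large language model | |
| 4 | Dissecting Representation Misalignment in Contrastive Learning via Influence Function | Proposes ECIF, an extended influence function for addressing representation misalignment in contrastive learning. | contrastive learning, multimodal | |
| 5 | EXCON: Extreme Instance-based Contrastive Representation Learning of Severely Imbalanced Multivariate Time Series for Solar Flare Prediction | EXCON: a solar flare prediction method based on extreme-instance contrastive learning for severely imbalanced multivariate time series. | predictive model, representation learning, contrastive learning | |
| 6 | Mapping out the Space of Human Feedback for Reinforcement Learning: A Conceptual Framework | Maps the space of human feedback for RL: proposes a conceptual framework that unifies feedback types and quality assessment. | reinforcement learning, RLHF | |
| 7 | Theoretical Corrections and the Leveraging of Reinforcement Learning to Enhance Triangle Attack | Proposes TARL, an RL-based Triangle Attack that improves the efficiency of black-box adversarial attacks. | reinforcement learning | |
| 8 | Robust Reinforcement Learning under Diffusion Models for Data with Jumps | Proposes the MSBVE algorithm, improving the robustness and convergence of RL under jump-diffusion models. | reinforcement learning | |
| 9 | Value Imprint: A Technique for Auditing the Human Values Embedded in RLHF Datasets | Value Imprint: a technique for auditing the human values embedded in RLHF datasets. | RLHF | |
| 10 | Near-Optimal Reinforcement Learning with Shuffle Differential Privacy | Proposes the SDP-PE algorithm, achieving near-optimal RL under shuffle differential privacy and addressing privacy leakage in networked systems. | reinforcement learning | |
| 11 | Structure learning with Temporal Gaussian Mixture for model-based Reinforcement Learning | Proposes a structure learning method based on a temporal Gaussian mixture model for model-based RL. | reinforcement learning | |
| 12 | Continual Task Learning through Adaptive Policy Self-Composition | Proposes CompoFormer, which addresses catastrophic forgetting in offline continual RL through adaptive policy composition. | reinforcement learning, offline RL, offline reinforcement learning | |
| 13 | Aligning Few-Step Diffusion Models with Dense Reward Difference Learning | Proposes SDPO, which aligns few-step diffusion models via dense reward difference learning and improves step generalization. | reinforcement learning, diffusion policy | |
| 14 | Reinforced Symbolic Learning with Logical Constraints for Predicting Turbine Blade Fatigue Life | Proposes RSL, an RL-based symbolic learning method for predicting turbine blade fatigue life. | reinforcement learning, deep reinforcement learning | |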

🔬 Pillar 9: Embodied Foundation Models (10 papers)

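| # | Title | One-line summary | Tags | 🔗 |
|---|-------|------------------|------|----|
| 15 | LLM-IE: A Python Package for Generative Information Extraction with Large Language Models | LLM-IE: a Python package for generative information extraction, with an interactive LLM agent that assists pipeline construction. | large language model | |
| 16 | Unveiling and Addressing Pseudo Forgetting in Large Language Models | Identifies and addresses pseudo forgetting in large language models, improving continual learning. | large language model | |
| 17 | Mechanism and Emergence of Stacked Attention Heads in Multi-Layer Transformers | Studies the mechanism and emergence of stacked attention heads in multi-layer Transformers. | large language model | |
| 18 | Random Forest-Supervised Manifold Alignment | Proposes a random forest-supervised manifold alignment method that improves cross-domain classification performance. | multimodal | |
| 19 | Tackling prediction tasks in relational databases with LLMs | Uses large language models to tackle prediction tasks in relational databases. | large language model | |
| 20 | Parallelly Tempered Generative Adversarial Nets: Toward Stabilized Gradients | Proposes parallelly tempered GANs, which stabilize gradients to address mode collapse in GAN training. | multimodal | |
| 21 | BitMoD: Bit-serial Mixture-of-Datatype LLM Acceleration | BitMoD: a bit-serial, mixture-of-datatype algorithm-hardware co-design for low-precision LLM acceleration. | large language model | |
| 22 | TSINR: Capturing Temporal Continuity via Implicit Neural Representations for Time Series Anomaly Detection | Proposes TSINR, which uses implicit neural representations to capture temporal continuity for time series anomaly detection. | large language model | |
| 23 | Preempting Text Sanitization Utility in Resource-Constrained Privacy-Preserving LLM Interactions | Proposes a middleware architecture that predicts, in resource-constrained settings, how differentially private text sanitization will affect LLM utility, avoiding wasted resources. | large language model | |
| 24 | Re-examining learning linear functions in context | Shows that Transformers, when learning linear functions in context, do not adopt algorithmic approaches such as linear regression. | large language model | |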

🔬 Pillar 1: Robot Control (1 paper)

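| # | Title | One-line summary | Tags | 🔗 |
|---|-------|------------------|------|----|
| 25 | Bridging the Resource Gap: Deploying Advanced Imitation Learning Models onto Affordable Embedded Platforms | Proposes an efficient transfer scheme for deploying advanced imitation learning models onto low-cost embedded platforms. | manipulation, imitation learning | |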

🔬 Pillar 7: Motion Retargeting (1 paper)

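| # | Title | One-line summary | Tags | 🔗 |
|---|-------|------------------|------|----|
| 26 | Unveiling the Inflexibility of Adaptive Embedding in Traffic Forecasting | Proposes a traffic forecasting model based on PCA embeddings that improves the generalization of spatiotemporal graph neural networks. | spatial relationship, spatiotemporal | |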

🔬 Pillar 8: Physics-based Animation (1 paper)

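| # | Title | One-line summary | Tags | 🔗 |
|---|-------|------------------|------|----|
| 27 | Exploring Eye Tracking to Detect Cognitive Load in Complex Virtual Reality Training | Uses eye tracking to detect cognitive load in complex VR training. | spatiotemporal | |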
