cs.LG (2024-09-19)

📊 18 papers | 🔗 3 with code

🎯 Interest Area Navigation

Pillar 2: RL Algorithms & Architecture (9, 🔗 1) · Pillar 9: Embodied Foundation Models (4, 🔗 2) · Pillar 3: Perception & Semantics (2) · Pillar 8: Physics-based Animation (2) · Pillar 1: Robot Control (1)

🔬 Pillar 2: RL Algorithms & Architecture (9 papers)

| # | Title | One-line Summary | Tags | 🔗 |
|---|-------|------------------|------|----|
| 1 | PromSec: Prompt Optimization for Secure Generation of Functional Source Code with Large Language Models (LLMs) | PromSec: secure generation of functional source code with LLMs via prompt optimization | contrastive learning, large language model | |
| 2 | Evaluating Defences against Unsafe Feedback in RLHF | Evaluates defences against unsafe feedback in RLHF, revealing the limitations of existing methods | reinforcement learning, RLHF, large language model | |
| 3 | Training Language Models to Self-Correct via Reinforcement Learning | Proposes SCoRe, which substantially improves LLM self-correction via reinforcement learning | reinforcement learning, large language model | |
| 4 | Privacy-Preserving Student Learning with Differentially Private Data-Free Distillation | Proposes a privacy-preserving student learning method based on differentially private data-free distillation | teacher-student, distillation | |
| 5 | Assessing the Zero-Shot Capabilities of LLMs for Action Evaluation in RL | Proposes the CALM framework, leveraging the zero-shot capabilities of LLMs for action evaluation in RL | reinforcement learning, reward shaping, large language model | |
| 6 | Revisiting Semi-supervised Adversarial Robustness via Noise-aware Online Robust Distillation | Proposes SNORD, improving semi-supervised adversarial robustness via noise-aware online robust distillation, with no pre-trained model required | distillation | |
| 7 | Disentangling Recognition and Decision Regrets in Image-Based Reinforcement Learning | Proposes disentangling recognition regret from decision regret in image-based RL, improving generalization | reinforcement learning | |
| 8 | The Central Role of the Loss Function in Reinforcement Learning | The central role of the loss function in RL: its impact on sample efficiency and adaptivity | reinforcement learning | |
| 9 | VCAT: Vulnerability-aware and Curiosity-driven Adversarial Training for Enhancing Autonomous Vehicle Robustness | Proposes VCAT, enhancing autonomous vehicle robustness under adversarial attacks | reinforcement learning, distillation | |

🔬 Pillar 9: Embodied Foundation Models (4 papers)

| # | Title | One-line Summary | Tags | 🔗 |
|---|-------|------------------|------|----|
| 10 | Exploring Representations and Interventions in Time Series Foundation Models | Explores representations and interventions in time-series foundation models, enabling model optimization and controllable analysis | foundation model | |
| 11 | FoME: A Foundation Model for EEG using Adaptive Temporal-Lateral Attention Scaling | FoME: an EEG foundation model based on adaptive temporal-lateral attention scaling | foundation model | |
| 12 | Is Tokenization Needed for Masked Particle Modelling? | Improves masked particle modelling, enabling a high-energy-physics foundation model without tokenization | foundation model | |
| 13 | Scaling FP8 training to trillion-token LLMs | Proposes Smooth-SwiGLU, resolving the late-stage instability caused by the SwiGLU activation when training LLMs in FP8 | large language model | |

🔬 Pillar 3: Perception & Semantics (2 papers)

| # | Title | One-line Summary | Tags | 🔗 |
|---|-------|------------------|------|----|
| 14 | Tokenization for Molecular Foundation Models | Proposes the Smirk and Smirk-GPE molecular tokenizers, improving molecular foundation models' coverage of chemical space | open-vocabulary, foundation model | |
| 15 | Shape-informed surrogate models based on signed distance function domain encoding | Proposes shape-informed surrogate models based on signed distance functions for efficiently solving parametrized PDEs | implicit representation | |

🔬 Pillar 8: Physics-based Animation (2 papers)

| # | Title | One-line Summary | Tags | 🔗 |
|---|-------|------------------|------|----|
| 16 | Hybrid Ensemble Deep Graph Temporal Clustering for Spatiotemporal Data | Proposes Hybrid Ensemble Deep Graph Temporal Clustering (HEDGTC) to improve clustering performance on spatiotemporal data | spatiotemporal | |
| 17 | Unrolled denoising networks provably learn optimal Bayesian inference | Proposes a learning framework of unrolled denoising networks that provably learns optimal Bayesian inference | AMP | |

🔬 Pillar 1: Robot Control (1 paper)

| # | Title | One-line Summary | Tags | 🔗 |
|---|-------|------------------|------|----|
| 18 | Unsupervised Reward-Driven Image Segmentation in Automated Scanning Transmission Electron Microscopy Experiments | Proposes an unsupervised reward-driven image segmentation method for automated scanning transmission electron microscopy experiments | manipulation | |
