cs.LG (2024-10-14)

📊 41 papers in total | 🔗 7 with code

🎯 Interest Area Navigation

Pillar 2: RL & Architecture (19 🔗2) · Pillar 9: Embodied Foundation Models (16 🔗3) · Pillar 3: Perception & Semantics (2 🔗1) · Pillar 1: Robot Control (1) · Pillar 7: Motion Retargeting (1) · Pillar 4: Generative Motion (1) · Pillar 8: Physics-based Animation (1 🔗1)

🔬 Pillar 2: RL & Architecture (19 papers)

| # | Title | One-Sentence Summary | Tags |
|---|---|---|---|
| 1 | AlphaDPO: Adaptive Reward Margin for Direct Preference Optimization | Improves direct preference optimization with an adaptive reward margin, strengthening LLM alignment | reinforcement learning, RLHF, DPO |
| 2 | Enhancing Robustness in Deep Reinforcement Learning: A Lyapunov Exponent Approach | Regularizes Dreamer V3 with the maximal Lyapunov exponent, improving deep RL robustness on continuous-control tasks | reinforcement learning, deep reinforcement learning, dreamer |
| 3 | Continual Deep Reinforcement Learning to Prevent Catastrophic Forgetting in Jamming Mitigation | PackNet-based continual deep RL that prevents catastrophic forgetting in anti-jamming communication | reinforcement learning, deep reinforcement learning, DRL |
| 4 | BrainGPT: Unleashing the Potential of EEG Generalist Foundation Model by Autoregressive Pre-training | BrainGPT: unlocks a generalist EEG foundation model through autoregressive pre-training | masked autoencoder, foundation model |
| 5 | LoLCATs: On Low-Rank Linearizing of Large Language Models | LoLCATs: low-rank linearization that improves the efficiency and quality of large language models | linear attention, large language model |
| 6 | Enhancing JEPAs with Spatial Conditioning: Robust and Efficient Representation Learning | Adds spatial conditioning to JEPAs for more robust and efficient representation learning | representation learning, masked autoencoder, MAE |
| 7 | HGAurban: Heterogeneous Graph Autoencoding for Urban Spatial-Temporal Learning | HGAurban: heterogeneous graph autoencoding that handles noise and sparsity in urban spatio-temporal data | masked autoencoder, spatial relationship, spatiotemporal |
| 8 | Mimetic Initialization Helps State Space Models Learn to Recall | A mimetic initialization that helps state space models learn recall tasks | Mamba, state space model |
| 9 | Action Gaps and Advantages in Continuous-Time Distributional Reinforcement Learning | Continuous-time distributional RL addressing performance in high-frequency decision-making | reinforcement learning, DRL |
| 10 | The Implicit Bias of Structured State Space Models Can Be Poisoned With Clean Labels | Shows structured state space models are vulnerable to clean-label poisoning attacks | SSM, state space model |
| 11 | Automated Filtering of Human Feedback Data for Aligning Text-to-Image Diffusion Models | FiFA: automatically filters human feedback data to better align text-to-image diffusion models | DPO, direct preference optimization, large language model |
| 12 | StatioCL: Contrastive Learning for Time Series via Non-Stationary and Temporal Contrast | StatioCL: non-stationary and temporal contrast for time series representations, mitigating false negatives | representation learning, contrastive learning |
| 13 | Transforming Game Play: A Comparative Study of DCQN and DTQN Architectures in Reinforcement Learning | Compares DCQN and DTQN on Atari games; DCQN is faster and wins on most games | reinforcement learning |
| 14 | Burning RED: Unlocking Subtask-Driven Reinforcement Learning and Risk-Awareness in Average-Reward Markov Decision Processes | RED framework for subtask-driven learning and risk-aware RL in average-reward MDPs | reinforcement learning |
| 15 | Revisiting and Benchmarking Graph Autoencoders: A Contrastive Learning Perspective | lrGAE: a contrastive-learning framework for graph autoencoders, setting a new benchmark for graph representation learning | contrastive learning |
| 16 | Improved Regret Bound for Safe Reinforcement Learning via Tighter Cost Pessimism and Reward Optimism | Safe RL with tighter cost-pessimistic and reward-optimistic estimates, yielding an improved regret bound | reinforcement learning |
| 17 | Stable Hadamard Memory: Revitalizing Memory-Augmented Agents for Reinforcement Learning | Stable Hadamard Memory: strengthens the memory of RL agents in partially observable environments | reinforcement learning |
| 18 | Learning Linear Attention in Polynomial Time | A theoretical framework for polynomial-time learnability of linear-attention Transformers, validated on tasks such as finite automata | linear attention |
| 19 | Lambda-Skip Connections: the architectural component that prevents Rank Collapse | Lambda-Skip connections: an architectural component that prevents rank collapse in sequence models | SSM, state space model |
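Entries #1 (AlphaDPO) and #11 (FiFA) both build on direct preference optimization. For background, here is a minimal sketch of the standard DPO objective; the margin term $\gamma$ is a schematic stand-in for AlphaDPO's adaptive reward margin, not the paper's exact formulation:

```latex
\mathcal{L}_{\mathrm{DPO}}(\pi_\theta) =
  -\,\mathbb{E}_{(x,\,y_w,\,y_l)\sim\mathcal{D}}\!\left[
    \log \sigma\!\Big(
      \beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)}
      \;-\; \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}
      \;-\; \gamma
    \Big)
  \right]
```

Here $\sigma$ is the logistic function, $\beta$ scales the implicit reward, and $(y_w, y_l)$ are the preferred and rejected responses. Standard DPO uses $\gamma = 0$; margin variants fix $\gamma > 0$, and AlphaDPO's title suggests it adapts this margin per example.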

🔬 Pillar 9: Embodied Foundation Models (16 papers)

| # | Title | One-Sentence Summary | Tags |
|---|---|---|---|
| 20 | GraphCLIP: Enhancing Transferability in Graph Foundation Models for Text-Attributed Graphs | GraphCLIP: graph-text contrastive pre-training that improves transferability on text-attributed graphs | large language model, foundation model, zero-shot transfer |
| 21 | Adapt-$\infty$: Scalable Continual Multimodal Instruction Tuning via Dynamic Data Selection | Adapt-$\infty$: dynamic data selection that tackles data redundancy in continual multimodal instruction tuning | large language model, multimodal |
| 22 | Federated Data-Efficient Instruction Tuning for Large Language Models | FedHDS: federated data-efficient instruction tuning, improving edge-side training efficiency and generalization of LLMs | large language model |
| 23 | Model-based Large Language Model Customization as Service | Llamdex: LLM customization as a service via model upload, protecting user data privacy | large language model |
| 24 | Moirai-MoE: Empowering Time Series Foundation Models with Sparse Mixture of Experts | Moirai-MoE: a sparse mixture of experts for time series foundation models, enabling automatic token-level specialization | foundation model |
| 25 | AlphaPruning: Using Heavy-Tailed Self Regularization Theory for Improved Layer-wise Pruning of Large Language Models | AlphaPruning: heavy-tailed self-regularization theory for improved layer-wise pruning of LLMs | large language model |
| 26 | SGLP: A Similarity Guided Fast Layer Partition Pruning for Compressing Large Deep Models | SGLP: similarity-guided fast layer-partition pruning for compressing large deep models | large language model |
| 27 | Liger Kernel: Efficient Triton Kernels for LLM Training | Liger Kernel: efficient Triton kernels that accelerate LLM training and reduce GPU memory use | large language model |
| 28 | Context-Parametric Inversion: Why Instruction Finetuning Can Worsen Context Reliance | Identifies context-parametric inversion in instruction finetuning and analyzes its causes and mitigations | large language model |
| 29 | SLaNC: Static LayerNorm Calibration | SLaNC: static LayerNorm calibration that fixes numerical issues in quantized LLM inference | large language model |
| 30 | GIFT-Eval: A Benchmark For General Time Series Forecasting Model Evaluation | GIFT-Eval: a comprehensive benchmark for evaluating general time series forecasting models | foundation model |
| 31 | Fed-pilot: Optimizing LoRA Allocation for Efficient Federated Fine-Tuning with Heterogeneous Clients | Fed-pilot: optimizes LoRA allocation for efficient federated fine-tuning with heterogeneous clients | foundation model |
| 32 | Is Parameter Collision Hindering Continual Learning in LLMs? | N-LoRA: reduces parameter collisions to improve continual learning in LLMs | large language model |
| 33 | HSR-Enhanced Sparse Attention Acceleration | An HSR-based acceleration method for long-context attention computation | large language model |
| 34 | Tracing Human Stress from Physiological Signals using UWB Radar | DST: deep stress tracing from UWB radar, enabling contactless continuous stress detection via multimodal physiological signal fusion | multimodal |
| 35 | Divide, Reweight, and Conquer: A Logit Arithmetic Approach for In-Context Learning | LARA: logit-arithmetic reweighting for in-context learning, improving long-sequence inference performance | large language model |
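Entries #31 (Fed-pilot) and #32 (N-LoRA) both operate on LoRA adapters. For background, a minimal sketch of the standard LoRA parametrization they build on (notation follows the original LoRA formulation, not these papers'):

```latex
W = W_0 + \Delta W = W_0 + \frac{\alpha}{r}\, B A,
\qquad B \in \mathbb{R}^{d \times r},\;
A \in \mathbb{R}^{r \times k},\;
r \ll \min(d, k)
```

Only $A$ and $B$ are trained while $W_0$ stays frozen. In these terms, Fed-pilot's "LoRA allocation" concerns how adapter capacity is distributed across heterogeneous clients, and N-LoRA's "parameter collision" concerns overlap between the $\Delta W$ updates learned for different tasks.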

🔬 Pillar 3: Perception & Semantics (2 papers)

| # | Title | One-Sentence Summary | Tags |
|---|---|---|---|
| 36 | Improved Depth Estimation of Bayesian Neural Networks | Bayesian neural network depth estimation based on truncated normal distributions, improving accuracy on the spiral dataset | depth estimation |
| 37 | Hybrid Spatial Representations for Species Distribution Modeling | Hybrid spatial representations that improve the spatial accuracy of species distribution modeling | implicit representation |

🔬 Pillar 1: Robot Control (1 paper)

| # | Title | One-Sentence Summary | Tags |
|---|---|---|---|
| 38 | Feedback Favors the Generalization of Neural ODEs | Feedback neural networks that improve neural ODE generalization under varying latent dynamics | domain randomization, model predictive control, latent dynamics |

🔬 Pillar 7: Motion Retargeting (1 paper)

| # | Title | One-Sentence Summary | Tags |
|---|---|---|---|
| 39 | Echo State Networks for Spatio-Temporal Area-Level Data | Echo state networks combined with graph spectral filtering, improving forecasts for spatio-temporal area-level data | spatial relationship |

🔬 Pillar 4: Generative Motion (1 paper)

| # | Title | One-Sentence Summary | Tags |
|---|---|---|---|
| 40 | Gaussian Mixture Vector Quantization with Aggregated Categorical Posterior | A Gaussian-mixture vector-quantized VAE that improves codebook utilization and reduces information loss | VQ-VAE |

🔬 Pillar 8: Physics-based Animation (1 paper)

| # | Title | One-Sentence Summary | Tags |
|---|---|---|---|
| 41 | Get Rid of Isolation: A Continuous Multi-task Spatio-Temporal Learning Framework | CMuST: a continuous multi-task learning framework for urban spatio-temporal data | spatiotemporal |
