| # | Title | Summary | Keywords |  |
|---|-------|---------|----------|:--:|
| 1 | AlphaDPO: Adaptive Reward Margin for Direct Preference Optimization | Alpha-DPO improves LLM alignment by adding an adaptive reward margin to direct preference optimization. | reinforcement learning, RLHF, DPO | ✅ |
| 2 | Enhancing Robustness in Deep Reinforcement Learning: A Lyapunov Exponent Approach | Proposes a Dreamer V3 variant regularized by the maximal Lyapunov exponent, improving the robustness of deep RL on continuous control tasks. | reinforcement learning, deep reinforcement learning, Dreamer |  |
| 3 | Continual Deep Reinforcement Learning to Prevent Catastrophic Forgetting in Jamming Mitigation | Proposes a PackNet-based continual deep RL method that addresses catastrophic forgetting in anti-jamming communication. | reinforcement learning, deep reinforcement learning, DRL |  |
| 4 | BrainGPT: Unleashing the Potential of EEG Generalist Foundation Model by Autoregressive Pre-training | Proposes BrainGPT, which unlocks the potential of a generalist EEG foundation model via autoregressive pre-training. | masked autoencoder, foundation model |  |
| 5 | LoLCATs: On Low-Rank Linearizing of Large Language Models | LoLCATs improves the efficiency and quality of large language models through low-rank linearization. | linear attention, large language model |  |
| 6 | Enhancing JEPAs with Spatial Conditioning: Robust and Efficient Representation Learning | Enhances JEPAs with spatial conditioning for more robust and efficient representation learning. | representation learning, masked autoencoder, MAE |  |
| 7 | HGAurban: Heterogeneous Graph Autoencoding for Urban Spatial-Temporal Learning | Proposes HGAurban, a heterogeneous graph autoencoder that tackles noise and sparsity in urban spatial-temporal data. | masked autoencoder, spatial relationship, spatiotemporal |  |
| 8 | Mimetic Initialization Helps State Space Models Learn to Recall | Proposes a mimetic initialization scheme that improves state space models' ability to learn recall tasks. | Mamba, state space model |  |
| 9 | Action Gaps and Advantages in Continuous-Time Distributional Reinforcement Learning | Proposes distributional RL methods to address performance issues in high-frequency decision making. | reinforcement learning, DRL |  |
| 10 | The Implicit Bias of Structured State Space Models Can Be Poisoned With Clean Labels | Reveals that the implicit bias of structured state space models makes them vulnerable to clean-label poisoning attacks. | SSM, state space model |  |
| 11 | Automated Filtering of Human Feedback Data for Aligning Text-to-Image Diffusion Models | Proposes FiFA, which automatically filters human feedback data to improve the alignment of text-to-image diffusion models. | DPO, direct preference optimization, large language model |  |
| 12 | StatioCL: Contrastive Learning for Time Series via Non-Stationary and Temporal Contrast | StatioCL improves time-series representations via non-stationary and temporal contrastive learning, mitigating the false-negative problem. | representation learning, contrastive learning |  |
| 13 | Transforming Game Play: A Comparative Study of DCQN and DTQN Architectures in Reinforcement Learning | Compares DCQN and DTQN on Atari games; finds DCQN superior in speed and on most games. | reinforcement learning |  |
| 14 | Burning RED: Unlocking Subtask-Driven Reinforcement Learning and Risk-Awareness in Average-Reward Markov Decision Processes | Proposes the RED framework for subtask-driven learning and risk-aware RL in average-reward MDPs. | reinforcement learning |  |
| 15 | Revisiting and Benchmarking Graph Autoencoders: A Contrastive Learning Perspective | Proposes lrGAE, a contrastive-learning-based graph autoencoder framework that establishes new benchmarks for graph representation learning. | contrastive learning | ✅ |
| 16 | Improved Regret Bound for Safe Reinforcement Learning via Tighter Cost Pessimism and Reward Optimism | Proposes a safe RL algorithm based on tighter cost-pessimistic and reward-optimistic estimates, improving the regret upper bound. | reinforcement learning |  |
| 17 | Stable Hadamard Memory: Revitalizing Memory-Augmented Agents for Reinforcement Learning | Proposes Stable Hadamard Memory, strengthening the memory of RL agents in partially observable environments. | reinforcement learning |  |
| 18 | Learning Linear Attention in Polynomial Time | Establishes a theoretical framework for the polynomial-time learnability of linear-attention Transformers and validates it on tasks such as finite automata. | linear attention |  |
| 19 | Lambda-Skip Connections: the architectural component that prevents Rank Collapse | Proposes Lambda-Skip connections, an architectural component that prevents rank collapse in sequence models. | SSM, state space model |  |