cs.LG (2024-11-18)

📊 27 papers in total | 🔗 2 with code

🎯 Interest Area Navigation

Pillar 2: RL & Architecture (14 🔗2) · Pillar 9: Embodied Foundation Models (10) · Pillar 1: Robot Control (1) · Pillar 7: Motion Retargeting (1) · Pillar 8: Physics-based Animation (1)

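🔬 Pillar 2: RL & Architecture (14 papers)

#	Title	One-line summary	Tags	🔗	⭐
1	MMBind: Unleashing the Potential of Distributed and Heterogeneous Data for Multimodal Learning in IoT	MMBind: leverages distributed, heterogeneous data for multimodal learning in IoT	contrastive learning foundation model multimodal	✅
2	Preserving Expert-Level Privacy in Offline Reinforcement Learning	Proposes a consensus-based, expert-level differentially private offline reinforcement learning method that protects expert privacy.	reinforcement learning offline RL offline reinforcement learning
3	METEOR: Evolutionary Journey of Large Language Models from Guidance to Self-Growth	Proposes METEOR, which takes large language models from guided learning to self-evolution	distillation large language model
4	Dissecting Representation Misalignment in Contrastive Learning via Influence Function	Proposes ECIF, which addresses representation misalignment in contrastive learning by extending influence functions	contrastive learning multimodal
5	EXCON: Extreme Instance-based Contrastive Representation Learning of Severely Imbalanced Multivariate Time Series for Solar Flare Prediction	EXCON: a solar flare prediction method based on extreme-instance contrastive learning that addresses severely imbalanced multivariate time series	predictive model representation learning contrastive learning
6	Mapping out the Space of Human Feedback for Reinforcement Learning: A Conceptual Framework	Maps out the space of human feedback for reinforcement learning, proposing a conceptual framework that unifies feedback types and quality assessment.	reinforcement learning RLHF
7	Theoretical Corrections and the Leveraging of Reinforcement Learning to Enhance Triangle Attack	Proposes TARL, a reinforcement-learning-based Triangle Attack that improves the efficiency of black-box adversarial attacks.	reinforcement learning
8	Robust Reinforcement Learning under Diffusion Models for Data with Jumps	Proposes the MSBVE algorithm to improve the robustness and convergence of reinforcement learning under jump-diffusion models	reinforcement learning
9	Value Imprint: A Technique for Auditing the Human Values Embedded in RLHF Datasets	Value Imprint: a technique for auditing the human values embedded in RLHF datasets	RLHF
10	Near-Optimal Reinforcement Learning with Shuffle Differential Privacy	Proposes the SDP-PE algorithm, which achieves near-optimal reinforcement learning under shuffle differential privacy and addresses privacy leakage in networked systems.	reinforcement learning
11	Structure learning with Temporal Gaussian Mixture for model-based Reinforcement Learning	Proposes a structure learning method based on a temporal Gaussian mixture model for model-based reinforcement learning.	reinforcement learning
12	Continual Task Learning through Adaptive Policy Self-Composition	Proposes CompoFormer, which addresses catastrophic forgetting in offline continual reinforcement learning through adaptive policy composition	reinforcement learning offline RL offline reinforcement learning
13	Aligning Few-Step Diffusion Models with Dense Reward Difference Learning	Proposes SDPO, which aligns few-step diffusion models via dense reward difference learning and improves step generalization	reinforcement learning diffusion policy	✅
14	Reinforced Symbolic Learning with Logical Constraints for Predicting Turbine Blade Fatigue Life	Proposes RSL, a reinforcement-learning-based symbolic learning method for predicting turbine blade fatigue life	reinforcement learning deep reinforcement learning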

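🔬 Pillar 9: Embodied Foundation Models (10 papers)

#	Title	One-line summary	Tags	🔗	⭐
15	LLM-IE: A Python Package for Generative Information Extraction with Large Language Models	LLM-IE: a Python package for generative information extraction, with an interactive LLM agent that assists pipeline construction.	large language model
16	Unveiling and Addressing Pseudo Forgetting in Large Language Models	Reveals and addresses pseudo forgetting in large language models, improving continual learning ability	large language model
17	Mechanism and Emergence of Stacked Attention Heads in Multi-Layer Transformers	Studies the mechanism and emergence of stacked attention heads in multi-layer Transformers	large language model
18	Random Forest-Supervised Manifold Alignment	Proposes a random-forest-supervised manifold alignment method that improves performance on cross-domain classification tasks.	multimodal
19	Tackling prediction tasks in relational databases with LLMs	Uses large language models to tackle prediction tasks in relational databases	large language model
20	Parallelly Tempered Generative Adversarial Nets: Toward Stabilized Gradients	Proposes parallelly tempered GANs, which address mode collapse in GAN training by stabilizing gradients.	multimodal
21	BitMoD: Bit-serial Mixture-of-Datatype LLM Acceleration	BitMoD: a bit-serial, mixture-of-datatype algorithm-hardware co-design for low-precision LLM acceleration	large language model
22	TSINR: Capturing Temporal Continuity via Implicit Neural Representations for Time Series Anomaly Detection	Proposes TSINR, which uses implicit neural representations to capture temporal continuity for time series anomaly detection.	large language model
23	Preempting Text Sanitization Utility in Resource-Constrained Privacy-Preserving LLM Interactions	Proposes a middleware architecture that, in resource-constrained settings, predicts how differentially private text sanitization will affect LLM utility, avoiding wasted resources.	large language model
24	Re-examining learning linear functions in context	Shows that Transformers do not adopt algorithmic approaches such as linear regression when learning linear functions in context.	large language model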

🔬 Pillar 1: Robot Control (1 paper)

#	Title	One-line summary	Tags	🔗	⭐
25	Bridging the Resource Gap: Deploying Advanced Imitation Learning Models onto Affordable Embedded Platforms	Proposes an efficient transfer scheme for deploying advanced imitation learning models onto low-cost embedded platforms	manipulation imitation learning

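🔬 Pillar 7: Motion Retargeting (1 paper)

#	Title	One-line summary	Tags	🔗	⭐
26	Unveiling the Inflexibility of Adaptive Embedding in Traffic Forecasting	Proposes a traffic forecasting model based on PCA embeddings that improves the generalization of spatiotemporal graph neural networks.	spatial relationship spatiotemporal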

🔬 Pillar 8: Physics-based Animation (1 paper)

#	Title	One-line summary	Tags	🔗	⭐
27	Exploring Eye Tracking to Detect Cognitive Load in Complex Virtual Reality Training	Uses eye tracking to detect cognitive load in complex VR training	spatiotemporal
