cs.LG(2024-10-27)

📊 15 papers in total | 🔗 3 with code

🎯 Navigation by interest area

Pillar 2: RL & Architecture (9 🔗2) · Pillar 9: Embodied Foundation Models (4) · Pillar 1: Robot Control (1) · Pillar 4: Generative Motion (1 🔗1)

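🔬 Pillar 2: RL & Architecture (9 papers)

# | Title | One-line summary | Tags | 🔗
1 | PaPaGei: Open Foundation Models for Optical Physiological Signals | PaPaGei: an open foundation model for optical physiological signals that improves PPG signal processing performance. | representation learning, contrastive learning, foundation model
2 | Deep Reinforcement Learning Agents for Strategic Production Policies in Microeconomic Market Simulations | Proposes a deep reinforcement learning method for optimizing production strategies in microeconomic markets. | reinforcement learning, deep reinforcement learning, DRL
3 | Q-Distribution guided Q-learning for offline reinforcement learning: Uncertainty penalized Q-value via consistency model | Proposes the QDQ algorithm, which uses a consistency model to guide Q-value distribution learning and address Q-value overestimation in offline reinforcement learning. | reinforcement learning, offline reinforcement learning
4 | Accelerating Direct Preference Optimization with Prefix Sharing | Proposes a prefix-sharing method to accelerate DPO, improving training throughput without affecting convergence. | DPO, direct preference optimization
5 | Generator Matching: Generative modeling with arbitrary Markov processes | Generator Matching: a general generative modeling framework based on arbitrary Markov processes. | flow matching, multimodal
6 | Uncovering Capabilities of Model Pruning in Graph Contrastive Learning | Proposes a graph contrastive learning method based on model pruning that improves unsupervised pre-training of graph neural networks. | contrastive learning
7 | Domain Specific Data Distillation and Multi-modal Embedding Generation | Proposes a domain data distillation and multi-modal embedding generation method that improves the accuracy of domain-specific attribute prediction. | distillation
8 | CloudCast -- Total Cloud Cover Nowcasting with Machine Learning | CloudCast: a U-Net-based convolutional neural network for short-term cloud cover forecasting. | MAE, optical flow
9 | ThunderKittens: Simple, Fast, and Adorable AI Kernels | ThunderKittens: a simple, fast, and easy-to-maintain AI kernel framework that improves GPU utilization. | state space model, linear attention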

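🔬 Pillar 9: Embodied Foundation Models (4 papers)

# | Title | One-line summary | Tags | 🔗
10 | Mind Your Step (by Step): Chain-of-Thought can Reduce Performance on Tasks where Thinking Makes Humans Worse | Finds that chain-of-thought (CoT) can reduce large-model performance on certain tasks, especially those where thinking makes humans perform worse. | multimodal, chain-of-thought
11 | Sequential Large Language Model-Based Hyper-parameter Optimization | SLLMBO: uses large language models for hyperparameter optimization, improving optimization efficiency and robustness. | large language model
12 | Deep Learning-Driven Microstructure Characterization and Vickers Hardness Prediction of Mg-Gd Alloys | Proposes a deep-learning-based multimodal fusion framework for predicting the Vickers hardness of Mg-Gd alloys. | multimodal
13 | Building, Reusing, and Generalizing Abstract Representations from Concrete Sequences | Proposes a non-parametric hierarchical variable learning model to improve sequence learning efficiency. | large language model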

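🔬 Pillar 1: Robot Control (1 paper)

# | Title | One-line summary | Tags | 🔗
14 | Efficient Diversity-based Experience Replay for Deep Reinforcement Learning | Proposes EDER, an efficient diversity-based experience replay method that improves the efficiency of high-dimensional reinforcement learning. | manipulation, reinforcement learning, deep reinforcement learning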

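🔬 Pillar 4: Generative Motion (1 paper)

# | Title | One-line summary | Tags | 🔗
15 | TabDiff: a Mixed-type Diffusion Model for Tabular Data Generation | TabDiff: a mixed-type diffusion model for tabular data generation that significantly improves data quality. | classifier-free guidance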
