cs.LG (2024-10-27)

📊 15 papers in total | 🔗 3 with code

🎯 Navigation by interest area

Pillar 2: RL & Architecture (9 🔗2) · Pillar 9: Embodied Foundation Models (4) · Pillar 1: Robot Control (1) · Pillar 4: Generative Motion (1 🔗1)

🔬 Pillar 2: RL & Architecture (9 papers)

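#	Title	One-line summary	Tags	🔗	⭐
1	PaPaGei: Open Foundation Models for Optical Physiological Signals	PaPaGei: an open foundation model for optical physiological signals that improves PPG signal processing performance.	representation learning, contrastive learning, foundation model
2	Deep Reinforcement Learning Agents for Strategic Production Policies in Microeconomic Market Simulations	Proposes a deep reinforcement learning method for optimizing production strategies in microeconomic markets.	reinforcement learning, deep reinforcement learning, DRL
3	Q-Distribution guided Q-learning for offline reinforcement learning: Uncertainty penalized Q-value via consistency model	Proposes QDQ, which uses a consistency model to guide Q-value distribution learning and address Q-value overestimation in offline reinforcement learning.	reinforcement learning, offline reinforcement learning
4	Accelerating Direct Preference Optimization with Prefix Sharing	Proposes a prefix-sharing method that accelerates DPO, raising training throughput without affecting convergence.	DPO, direct preference optimization	✅
5	Generator Matching: Generative modeling with arbitrary Markov processes	Generator Matching: a general generative modeling framework based on arbitrary Markov processes.	flow matching, multimodal
6	Uncovering Capabilities of Model Pruning in Graph Contrastive Learning	Proposes a graph contrastive learning method based on model pruning that improves unsupervised pre-training of graph neural networks.	contrastive learning
7	Domain Specific Data Distillation and Multi-modal Embedding Generation	Proposes a method for domain data distillation and multi-modal embedding generation that improves the accuracy of domain-specific attribute prediction.	distillation
8	CloudCast -- Total Cloud Cover Nowcasting with Machine Learning	CloudCast: a U-Net-based convolutional neural network for short-term cloud cover forecasting.	MAE, optical flow	✅
9	ThunderKittens: Simple, Fast, and Adorable AI Kernels	ThunderKittens: a simple, fast, and easy-to-maintain AI kernel framework that improves GPU utilization.	state space model, linear attention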

🔬 Pillar 9: Embodied Foundation Models (4 papers)

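#	Title	One-line summary	Tags	🔗	⭐
10	Mind Your Step (by Step): Chain-of-Thought can Reduce Performance on Tasks where Thinking Makes Humans Worse	Finds that chain-of-thought (CoT) can reduce large-model performance on certain tasks, especially those where thinking also makes humans perform worse.	multimodal, chain-of-thought
11	Sequential Large Language Model-Based Hyper-parameter Optimization	SLLMBO: uses large language models for hyperparameter optimization, improving optimization efficiency and robustness.	large language model
12	Deep Learning-Driven Microstructure Characterization and Vickers Hardness Prediction of Mg-Gd Alloys	Proposes a deep-learning-based multimodal fusion framework for predicting the Vickers hardness of Mg-Gd alloys.	multimodal
13	Building, Reusing, and Generalizing Abstract Representations from Concrete Sequences	Proposes a non-parametric hierarchical variable learning model to improve sequence learning efficiency.	large language model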

🔬 Pillar 1: Robot Control (1 paper)

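#	Title	One-line summary	Tags	🔗	⭐
14	Efficient Diversity-based Experience Replay for Deep Reinforcement Learning	Proposes EDER, an efficient diversity-based experience replay method that improves reinforcement learning efficiency in high-dimensional settings.	manipulation, reinforcement learning, deep reinforcement learning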

🔬 Pillar 4: Generative Motion (1 paper)

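#	Title	One-line summary	Tags	🔗	⭐
15	TabDiff: a Mixed-type Diffusion Model for Tabular Data Generation	TabDiff: a mixed-type diffusion model for tabular data generation that significantly improves the quality of generated data.	classifier-free guidance	✅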
