cs.LG (2024-05-31)

📊 28 papers in total | 🔗 4 with code

🎯 Interest Area Navigation

Pillar 2: RL & Architecture (14 🔗3) | Pillar 9: Embodied Foundation Models (10 🔗1) | Pillar 1: Robot Control (2) | Pillar 8: Physics-based Animation (2)

🔬 Pillar 2: RL & Architecture (14 papers)

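# | Title | One-sentence summary | Tags | 🔗
1 | In-Context Decision Transformer: Reinforcement Learning via Hierarchical Chain-of-Thought | Proposes In-context Decision Transformer, which accelerates offline reinforcement learning through a hierarchical chain-of-thought. | reinforcement learning, offline reinforcement learning, decision transformer
2 | Decision Mamba: Reinforcement Learning via Hybrid Selective Sequence Modeling | Proposes Decision Mamba-Hybrid, which combines the strengths of Transformers and Mamba to improve the efficiency of long-horizon decision-making in reinforcement learning. | reinforcement learning, decision transformer, Mamba
3 | Generative AI for Deep Reinforcement Learning: Framework, Analysis, and Use Cases | Proposes a GAI-enhanced DRL framework that improves the sample efficiency and generalization of DRL in complex environments. | reinforcement learning, deep reinforcement learning, DRL
4 | Mamba State-Space Models Are Lyapunov-Stable Learners | Mamba state-space models: robust learning backed by Lyapunov stability. | Mamba, SSM, large language model
5 | Exploratory Preference Optimization: Harnessing Implicit Q*-Approximation for Sample-Efficient RLHF | Proposes the XPO algorithm, which achieves efficient exploration in RLHF preference optimization via implicit Q*-approximation. | reinforcement learning, RLHF, DPO
6 | Diffusion Actor-Critic: Formulating Constrained Policy Iteration as Diffusion Noise Regression for Offline Reinforcement Learning | Proposes Diffusion Actor-Critic, which addresses the policy constraint problem in offline reinforcement learning through diffusion noise regression. | reinforcement learning, offline reinforcement learning
7 | Amortizing intractable inference in diffusion models for vision, language, and control | Proposes relative trajectory balance to address posterior inference in diffusion models. | reinforcement learning, deep reinforcement learning, offline reinforcement learning
8 | LInK: Learning Joint Representations of Design and Performance Spaces through Contrastive Learning for Mechanism Synthesis | LInK: learns joint representations of design and performance spaces through contrastive learning, for mechanism synthesis. | contrastive learning, multimodal
9 | Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality | Unifies Transformers and SSMs through structured state space duality and proposes efficient algorithms. | Mamba, SSM
10 | Bayesian Design Principles for Offline-to-Online Reinforcement Learning | Proposes a Bayesian-design-based offline-to-online reinforcement learning method that resolves the pessimism/optimism dilemma in policy optimization. | reinforcement learning, offline reinforcement learning
11 | Flow matching achieves almost minimax optimal convergence | Proposes a flow matching method that achieves almost optimal convergence. | flow matching
12 | Reinforcement Learning for Sociohydrology | Proposes a reinforcement-learning-based sociohydrology framework for runoff control in land-use management. | reinforcement learning
13 | Improving Paratope and Epitope Prediction by Multi-Modal Contrastive Learning and Interaction Informativeness Estimation | Proposes MIPE to address antibody-antigen binding site prediction. | contrastive learning
14 | Heterophilous Distribution Propagation for Graph Neural Networks | Proposes Heterophilous Distribution Propagation (HDP) for graph neural networks, addressing node representation learning on heterophilous graphs. | representation learning, contrastive learning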

🔬 Pillar 9: Embodied Foundation Models (10 papers)

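# | Title | One-sentence summary | Tags | 🔗
15 | Improved Techniques for Optimization-Based Jailbreaking on Large Language Models | I-GCG: raises the efficiency of jailbreak attacks on large language models through improved optimization techniques. | large language model
16 | Information limits and Thouless-Anderson-Palmer equations for spiked matrix models with structured noise | For spiked matrix models with structured noise, proposes a TAP-equation-based algorithm that approaches the information-theoretic limits. | PaLM-E
17 | LCQ: Low-Rank Codebook based Quantization for Large Language Models | Proposes LCQ, a low-rank codebook based quantization method for compressing large language models. | large language model
18 | LOLAMEME: Logic, Language, Memory, Mechanistic Framework | Proposes the LOLAMEME framework for a mechanistic understanding of logic, language, and memory in large language models. | large language model
19 | Query2CAD: Generating CAD models using natural language queries | Query2CAD: generates CAD models from natural language queries, without supervised data or additional training. | large language model
20 | QuanTA: Efficient High-Rank Fine-Tuning of LLMs with Quantum-Informed Tensor Adaptation | Proposes QuanTA, which uses quantum-informed tensor adaptation for efficient high-rank fine-tuning of large models. | large language model
21 | From Unstructured Data to In-Context Learning: Exploring What Tasks Can Be Learned and When | Reveals the in-context learning abilities, and their limits, of LLMs trained on unstructured data. | large language model
22 | Scalable Bayesian Learning with posteriors | Proposes the posteriors library, which combines tempered SGMCMC with improved deep ensembles for scalable Bayesian learning. | large language model
23 | Effective Interplay between Sparsity and Quantization: From Theory to Practice | Reveals that sparsity and quantization are non-orthogonal, and optimizes compression strategies for large models. | large language model
24 | Outliers and Calibration Sets have Diminishing Effect on Quantization of Modern LLMs | Calibration sets and outliers have a diminishing effect on the quantization of modern LLMs; focuses on inference speed optimization. | large language model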

🔬 Pillar 1: Robot Control (2 papers)

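# | Title | One-sentence summary | Tags | 🔗
25 | Enhancing Efficiency of Safe Reinforcement Learning via Sample Manipulation | Proposes ESPO, which improves the efficiency of safe reinforcement learning through sample manipulation. | manipulation, reinforcement learning
26 | A Sim2Real Approach for Identifying Task-Relevant Properties in Interpretable Machine Learning | Proposes XAIsim2real, which uses simulated user studies to optimize the identification of task-relevant properties in interpretable machine learning. | sim2real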

🔬 Pillar 8: Physics-based Animation (2 papers)

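# | Title | One-sentence summary | Tags | 🔗
27 | Streamflow Prediction with Uncertainty Quantification for Water Management: A Constrained Reasoning and Learning Approach | Proposes a streamflow prediction method that combines constrained reasoning and learning and quantifies uncertainty, for water resource management. | spatiotemporal
28 | Waveform Design for Over-the-Air Computing | Proposes a deep-learning-based waveform design method to address synchronization errors and inter-symbol interference in over-the-air computing. | PULSE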
