cs.LG (2024-10-30)

📊 39 papers total | 🔗 4 with code

🎯 Interest Area Navigation

Pillar 2: RL Algorithms & Architecture (19 🔗2) · Pillar 9: Embodied Foundation Models (18 🔗2) · Pillar 1: Robot Control (1) · Pillar 8: Physics-based Animation (1)

🔬 Pillar 2: RL Algorithms & Architecture (19 papers)

| # | Title | One-line Summary | Tags | 🔗 |
|---|---|---|---|---|
| 1 | FlowLLM: Flow Matching for Material Generation with Large Language Models as Base Distributions | FlowLLM: a crystal-material generation model combining LLMs with flow matching, significantly improving the efficiency of stable-material discovery. | flow matching, large language model | |
| 2 | Offline Reinforcement Learning and Sequence Modeling for Downlink Link Adaptation | Proposes a downlink link adaptation method based on offline reinforcement learning and sequence modeling. | reinforcement learning, offline RL | |
| 3 | Return Augmented Decision Transformer for Off-Dynamics Reinforcement Learning | Proposes the Return Augmented Decision Transformer to address offline off-dynamics reinforcement learning. | reinforcement learning, policy learning, decision transformer | |
| 4 | Online Intrinsic Rewards for Decision Making Agents from Large Language Model Feedback | Proposes ONI: online generation of intrinsic rewards for decision-making agents from large language model feedback. | reinforcement learning, large language model | |
| 5 | Grounding by Trying: LLMs with Reinforcement Learning-Enhanced Retrieval | Proposes the LeReT framework to improve LLMs' information-retrieval capability. | reinforcement learning, large language model | |
| 6 | Offline Behavior Distillation | Proposes offline behavior distillation to improve reinforcement learning training efficiency. | reinforcement learning, policy learning, distillation | |
| 7 | DECRL: A Deep Evolutionary Clustering Jointed Temporal Knowledge Graph Representation Learning Approach | DECRL: a deep evolutionary clustering jointed temporal knowledge graph representation learning approach. | representation learning, TAMP | |
| 8 | Resource Governance in Networked Systems via Integrated Variational Autoencoders and Reinforcement Learning | Proposes a resource-governance framework based on VAEs and reinforcement learning that dynamically adjusts network structure to optimize system performance. | reinforcement learning, deep reinforcement learning | |
| 9 | VPO: Leveraging the Number of Votes in Preference Optimization | VPO: leverages vote counts in preference optimization to improve language-model generation quality. | reinforcement learning, RLHF, DPO | |
| 10 | Contrastive Learning and Adversarial Disentanglement for Privacy-Aware Task-Oriented Semantic Communication | Proposes the CLAD model, achieving privacy-aware task-oriented semantic communication via contrastive learning and adversarial disentanglement. | contrastive learning | |
| 11 | Kernel-Based Function Approximation for Average Reward Reinforcement Learning: An Optimist No-Regret Algorithm | Proposes an optimistic no-regret algorithm for kernel-based average-reward reinforcement learning. | reinforcement learning | |
| 12 | Mechanistic Interpretability of Reinforcement Learning Agents | Dissects the internal mechanisms of RL agents to reveal their decision processes and potential biases. | reinforcement learning | |
| 13 | Model-free Low-Rank Reinforcement Learning via Leveraged Entry-wise Matrix Estimation | Proposes the LoRa-PI algorithm to solve low-rank reinforcement learning problems. | reinforcement learning | |
| 14 | Stepping Out of the Shadows: Reinforcement Learning in Shadow Mode | Proposes shadow-mode reinforcement learning to address the training difficulty and damage risk of physical systems. | reinforcement learning | |
| 15 | Adaptive Network Intervention for Complex Systems: A Hierarchical Graph Reinforcement Learning Approach | Proposes HGRL, a hierarchical graph reinforcement learning framework for dynamic-network-based intervention governance in complex multi-agent systems. | reinforcement learning | |
| 16 | Sequential Order-Robust Mamba for Time Series Forecasting | Proposes SOR-Mamba, improving the Mamba model's robustness to channel order in time-series forecasting. | Mamba | |
| 17 | Higher-order Cross-structural Embedding Model for Time Series Analysis | Proposes the High-TS model for time-series analysis via higher-order cross-structural embedding. | contrastive learning, TAMP | |
| 18 | Incremental Learning of Retrievable Skills For Efficient Continual Task Adaptation | IsCiL: efficient continual task adaptation through incremental learning of retrievable skills. | imitation learning, foundation model | |
| 19 | COMAL: A Convergent Meta-Algorithm for Aligning LLMs with General Preferences | COMAL: a convergent meta-algorithm for aligning LLMs with general preferences. | reinforcement learning, RLHF | |

🔬 Pillar 9: Embodied Foundation Models (18 papers)

| # | Title | One-line Summary | Tags | 🔗 |
|---|---|---|---|---|
| 20 | Vision-Language Models Can Self-Improve Reasoning via Reflection | Proposes the R3V framework, improving vision-language model performance via self-reflective CoT reasoning. | large language model, multimodal, chain-of-thought | |
| 21 | Partial Channel Dependence with Channel Masks for Time Series Foundation Models | Proposes a partial channel dependence method based on channel masks for time-series foundation models. | foundation model | |
| 22 | Exploring Gradient Subspaces: Addressing and Overcoming LoRA's Limitations in Federated Fine-Tuning of Large Language Models | Reveals LoRA's limitations in federated fine-tuning of LLMs and proposes a better gradient-subspace-based method. | large language model | |
| 23 | GWQ: Gradient-Aware Weight Quantization for Large Language Models | Proposes gradient-aware weight quantization (GWQ) for low-bit quantization of large language models. | large language model | |
| 24 | A Comprehensive Study on Quantization Techniques for Large Language Models | A survey of quantization techniques for large language models, aiming to reduce model size and accelerate inference. | large language model | |
| 25 | Improving Uncertainty Quantification in Large Language Models via Semantic Embeddings | Proposes a semantic-embedding-based uncertainty quantification method for LLMs, improving reliability. | large language model | |
| 26 | AI in Investment Analysis: LLMs for Equity Stock Ratings | Uses large language models to generate equity stock ratings, improving the efficiency and accuracy of investment analysis. | large language model, multimodal | |
| 27 | EF-LLM: Energy Forecasting LLM with AI-assisted Automation, Enhanced Sparse Prediction, Hallucination Detection | Proposes EF-LLM, tackling energy forecasting via AI-assisted automation, enhanced sparse prediction, and hallucination detection. | large language model, multimodal | |
| 28 | Keep on Swimming: Real Attackers Only Need Partial Knowledge of a Multi-Model System | A partial-knowledge adversarial attack method against multi-model systems. | foundation model | |
| 29 | Tiny Transformers Excel at Sentence Compression | Small Transformers achieve excellent sentence-compression performance. | large language model | |
| 30 | Dynamic Information Sub-Selection for Decision Support | Proposes the Dynamic Information Sub-Selection (DISS) framework to improve the performance of black-box decision-makers. | large language model | |
| 31 | ProTransformer: Robustify Transformers via Plug-and-Play Paradigm | Proposes the plug-and-play ProTransformer module, improving Transformer robustness across a range of tasks and attacks. | large language model | |
| 32 | TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters | TokenFormer: natively scalable Transformer models via tokenized model parameters. | foundation model | |
| 33 | Toward Understanding In-context vs. In-weight Learning | Reveals theoretical mechanisms for the emergence and disappearance of in-context learning in Transformers via simplified distributional properties. | large language model | |
| 34 | Focus On This, Not That! Steering LLMs with Adaptive Feature Specification | Proposes Focus Instruction Tuning to steer LLM behavior via adaptive feature specification. | large language model | |
| 35 | Towards Robust and Efficient Federated Low-Rank Adaptation with Heterogeneous Clients | Proposes LoRA-A$^2$, addressing the robustness and efficiency of low-rank adaptation in heterogeneous federated learning. | large language model | |
| 36 | The Graph's Apprentice: Teaching an LLM Low Level Knowledge for Circuit Quality Estimation | Proposes an LLM circuit-quality estimation method combining GNN embeddings, accelerating hardware-design iteration. | large language model | |
| 37 | A Theoretical Perspective for Speculative Decoding Algorithm | Analyzes speculative decoding from a theoretical perspective, revealing intrinsic connections among LLM components. | large language model | |

🔬 Pillar 1: Robot Control (1 paper)

| # | Title | One-line Summary | Tags | 🔗 |
|---|---|---|---|---|
| 38 | Kinetix: Investigating the Training of General Agents through Open-Ended Physics-Based Control Tasks | Kinetix: trains general agents through open-ended physics-based control tasks. | locomotion, reinforcement learning | |

🔬 Pillar 8: Physics-based Animation (1 paper)

| # | Title | One-line Summary | Tags | 🔗 |
|---|---|---|---|---|
| 39 | Generative forecasting of brain activity enhances Alzheimer's classification and interpretation | Uses generative forecasting of brain activity to enhance Alzheimer's classification and interpretation, based on rs-fMRI and the BrainLM model. | spatiotemporal | |
