cs.LG (2024-11-07)

📊 29 papers in total | 🔗 2 with code

🎯 Interest Area Navigation

Pillar 2: RL Algorithms & Architecture (14) · Pillar 9: Embodied Foundation Models (10, 🔗 2) · Pillar 1: Robot Control (3) · Pillar 7: Motion Retargeting (1) · Pillar 8: Physics-based Animation (1)

🔬 Pillar 2: RL Algorithms & Architecture (14 papers)

| # | Title | One-sentence takeaway | Tags | 🔗 |
|---|-------|-----------------------|------|----|
| 1 | Hypercube Policy Regularization Framework for Offline Reinforcement Learning | Proposes a hypercube policy regularization framework that improves offline RL performance on low-quality datasets. | reinforcement learning, TD3, offline reinforcement learning | |
| 2 | Constrained Latent Action Policies for Model-Based Offline Reinforcement Learning | Proposes constrained latent action policies to address the out-of-sample problem in offline RL. | reinforcement learning, policy learning, offline reinforcement learning | |
| 3 | Pruning the Path to Optimal Care: Identifying Systematically Suboptimal Medical Decision-Making with Inverse Reinforcement Learning | Uses inverse reinforcement learning to identify systematically suboptimal medical decision-making in the ICU. | reinforcement learning, inverse reinforcement learning | |
| 4 | Interactive Dialogue Agents via Reinforcement Learning on Hindsight Regenerations | Proposes an RL method based on hindsight regenerations, improving interactive dialogue agents in mental-health support and charitable-donation settings. | reinforcement learning, offline reinforcement learning, large language model | |
| 5 | Sharp Analysis for KL-Regularized Contextual Bandits and RLHF | Provides a sharp analysis of KL regularization for contextual bandits and RLHF. | reinforcement learning, policy learning, offline RL | |
| 6 | Towards Improved Preference Optimization Pipeline: from Data Generation to Budget-Controlled Regularization | Proposes an improved preference-optimization pipeline, boosting LLM alignment via better data generation and budget-controlled regularization. | DPO, direct preference optimization, large language model | |
| 7 | Scaling Laws for Pre-training Agents and World Models | Reveals scaling laws for pre-trained agents and world models, informing the trade-off between model size and data. | imitation learning, world model | |
| 8 | Performative Reinforcement Learning with Linear Markov Decision Process | Studies performative RL under linear MDPs; proposes a repeated regularized optimization method with convergence guarantees. | reinforcement learning | |
| 9 | Watermarking Language Models through Language Models | Proposes a prompting-based watermarking framework for language models, enabling provenance tracking and oversight without access to model internals. | distillation, large language model | |
| 10 | Fed-LDR: Federated Local Data-infused Graph Creation with Node-centric Model Refinement | Fed-LDR: a federated-learning framework that builds graphs infused with local data, with node-centric model refinement, for urban spatiotemporal analysis. | MAE, spatial relationship | |
| 11 | Semantic-Aware Resource Management for C-V2X Platooning via Multi-Agent Reinforcement Learning | Proposes semantic-aware resource management via multi-agent RL for C-V2X platooning communication. | reinforcement learning | |
| 12 | Comparing Fairness of Generative Mobility Models | Proposes a framework for evaluating the fairness of generative mobility models, revealing a trade-off between model accuracy and fairness. | predictive model, spatiotemporal | |
| 13 | Solving Hidden Monotone Variational Inequalities with Surrogate Losses | Proposes surrogate-loss-based algorithms for solving hidden monotone variational inequalities in deep learning. | reinforcement learning, deep reinforcement learning | |
| 14 | Zero-Shot Temporal Resolution Domain Adaptation for Spiking Neural Networks | Proposes zero-shot temporal-resolution domain adaptation, addressing SNN performance degradation on data with differing temporal resolutions. | SSM, state space model | |
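For background on entry 5: KL-regularized RLHF typically optimizes the objective below, where $\pi_{\mathrm{ref}}$ is the frozen reference policy, $r(x,y)$ the reward, and $\beta$ the regularization strength. This is the standard textbook formulation, not necessarily the paper's exact setup:

```latex
\max_{\pi}\;
\mathbb{E}_{x \sim \mathcal{D},\; y \sim \pi(\cdot \mid x)}
\bigl[ r(x, y) \bigr]
\;-\;
\beta \, \mathrm{KL}\!\bigl( \pi(\cdot \mid x) \,\big\|\, \pi_{\mathrm{ref}}(\cdot \mid x) \bigr)
```

Larger $\beta$ keeps the learned policy closer to the reference model; the paper's "sharp analysis" concerns how this regularizer affects sample complexity.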

🔬 Pillar 9: Embodied Foundation Models (10 papers)

| # | Title | One-sentence takeaway | Tags | 🔗 |
|---|-------|-----------------------|------|----|
| 15 | Exploring Hierarchical Molecular Graph Representation in Multimodal LLMs | Studies hierarchical molecular-graph representation in multimodal LLMs, showing that existing models understand graph features poorly. | large language model, multimodal | |
| 16 | OneProt: Towards Multi-Modal Protein Foundation Models | OneProt: a multimodal protein foundation model fusing structure, sequence, text, and binding-site data. | foundation model | |
| 17 | Benchmarking Large Language Models with Integer Sequence Generation Tasks | Proposes an integer-sequence generation benchmark to evaluate LLMs' mathematical reasoning and code synthesis. | large language model | |
| 18 | Hardware and Software Platform Inference | Proposes hardware and software platform inference to identify the underlying GPU architecture and software stack of black-box ML models. | large language model | |
| 19 | Measure-to-measure interpolation using Transformers | Proposes Transformer-based measure-to-measure interpolation, mapping arbitrary input measures to output measures. | large language model | |
| 20 | LLM-R: A Framework for Domain-Adaptive Maintenance Scheme Generation Combining Hierarchical Agents and RAG | LLM-R: a domain-adaptive maintenance-scheme generation framework combining hierarchical agents with RAG. | large language model | |
| 21 | Towards Unifying Interpretability and Control: Evaluation via Intervention | Proposes an intervention-based evaluation framework that unifies the evaluation and control of LLM interpretability methods. | large language model | |
| 22 | Variational Low-Rank Adaptation Using IVON | Proposes variational low-rank adaptation using IVON, improving LoRA accuracy and calibration. | large language model | |
| 23 | Unlearning in- vs. out-of-distribution data in LLMs under gradient-based method | Proposes evaluation metrics to study how gradient-based unlearning in LLMs differs between in- and out-of-distribution data. | large language model | |
| 24 | Robust and Efficient Fine-tuning of LLMs with Bayesian Reparameterization of Low-Rank Adaptation | Proposes MonteCLoRA, achieving robust and efficient LLM fine-tuning via Bayesian reparameterization of low-rank adaptation. | large language model | |
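Entries 22 and 24 both build on low-rank adaptation (LoRA). As background, here is a minimal sketch of the standard LoRA parameterization that those papers extend (NumPy; dimensions and names are illustrative, not taken from either paper):

```python
import numpy as np

rng = np.random.default_rng(0)

d_out, d_in, r = 8, 8, 2                    # layer dims and low rank (r << d)
W0 = rng.standard_normal((d_out, d_in))     # frozen pretrained weight
A = rng.standard_normal((r, d_in)) * 0.01   # trainable down-projection
B = np.zeros((d_out, r))                    # trainable up-projection, zero-init

def lora_forward(x, scale=1.0):
    """Forward pass through the adapted layer: W0 plus low-rank update B @ A."""
    return x @ (W0 + scale * (B @ A)).T

x = rng.standard_normal((4, d_in))
# With B zero-initialized, the adapted layer starts identical to the frozen one.
assert np.allclose(lora_forward(x), x @ W0.T)
```

Only `A` and `B` (2·r·d parameters instead of d²) are trained; the variational (entry 22) and Bayesian (entry 24) methods place distributions over these low-rank factors rather than point estimates.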

🔬 Pillar 1: Robot Control (3 papers)

| # | Title | One-sentence takeaway | Tags | 🔗 |
|---|-------|-----------------------|------|----|
| 25 | Evaluating Robustness of Reinforcement Learning Algorithms for Autonomous Shipping | Evaluates the robustness of RL algorithms for autonomous shipping, validating SAC's effectiveness in inland-waterway environments. | motion planning, reinforcement learning, deep reinforcement learning | |
| 26 | Q-SFT: Q-Learning for Language Models via Supervised Fine-Tuning | Proposes Q-SFT, recasting Q-learning as supervised fine-tuning to improve language models on multi-turn RL tasks. | manipulation, reinforcement learning, offline RL | |
| 27 | Enabling Adaptive Agent Training in Open-Ended Simulators by Targeting Diversity | DIVA: enables adaptive agent training in open-ended simulators by targeting diversity. | domain randomization, reinforcement learning | |

🔬 Pillar 7: Motion Retargeting (1 paper)

| # | Title | One-sentence takeaway | Tags | 🔗 |
|---|-------|-----------------------|------|----|
| 28 | Exploring How Generative MLLMs Perceive More Than CLIP with the Same Vision Encoder | Shows that generative MLLMs perceive more than CLIP with the same vision encoder, revealing the architectural designs key to visual-information extraction. | spatial relationship, large language model, multimodal | |

🔬 Pillar 8: Physics-based Animation (1 paper)

| # | Title | One-sentence takeaway | Tags | 🔗 |
|---|-------|-----------------------|------|----|
| 29 | TrajGPT: Controlled Synthetic Trajectory Generation Using a Multitask Transformer-Based Spatiotemporal Model | TrajGPT: a multitask Transformer-based spatiotemporal model for controlled synthetic trajectory generation. | spatiotemporal, large language model | |
