cs.LG (2024-10-28)

📊 33 papers in total | 🔗 6 with code

🎯 Interest Area Navigation

Pillar 2: RL & Architecture (17 🔗2) · Pillar 9: Embodied Foundation Models (13 🔗4) · Pillar 1: Robot Control (1) · Pillar 5: Interaction & Reaction (1) · Pillar 8: Physics-based Animation (1)

🔬 Pillar 2: RL & Architecture (17 papers)

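| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 1 | BraVE: Offline Reinforcement Learning for Discrete Combinatorial Action Spaces | BraVE: an offline reinforcement learning method for discrete combinatorial action spaces | reinforcement learning, offline RL, offline reinforcement learning | |
| 2 | Dual-Agent Deep Reinforcement Learning for Dynamic Pricing and Replenishment | Proposes a dual-agent deep reinforcement learning algorithm to handle the mismatched decision frequencies of dynamic pricing and replenishment | reinforcement learning, deep reinforcement learning, DRL | |
| 3 | Unveiling the Role of Expert Guidance: A Comparative Analysis of User-centered Imitation Learning and Traditional Reinforcement Learning | Compares imitation learning with reinforcement learning to reveal the role of expert guidance in intelligent systems | reinforcement learning, imitation learning | |
| 4 | FALCON: Feedback-driven Adaptive Long/short-term memory reinforced Coding Optimization system | Proposes FALCON, a feedback-driven adaptive long/short-term memory reinforced coding optimization system, to improve code generation quality | reinforcement learning, RLHF, large language model | |
| 5 | Beyond Autoregression: Fast LLMs via Self-Distillation Through Time | Proposes fast diffusion language models based on self-distillation through time, significantly improving generation speed and text quality | distillation, large language model | |
| 6 | Faster WIND: Accelerating Iterative Best-of-$N$ Distillation for LLM Alignment | Proposes Faster WIND, which accelerates LLM alignment by making iterative Best-of-$N$ distillation more efficient | distillation, large language model | |
| 7 | Robustness and Generalization in Quantum Reinforcement Learning via Lipschitz Regularization | Proposes the RegQPG algorithm, which uses Lipschitz regularization to improve the robustness and generalization of quantum reinforcement learning | reinforcement learning, curriculum learning | |
| 8 | The Limits of Transfer Reinforcement Learning with Latent Low-rank Structure | For reinforcement learning with large state spaces, proposes a transfer RL method based on latent low-rank structure | reinforcement learning | |
| 9 | A Multi-Agent Reinforcement Learning Testbed for Cognitive Radio Applications | Extends RFRL Gym to support testing and evaluation of multi-agent reinforcement learning in cognitive radio applications | reinforcement learning | |
| 10 | Flow Matching for Atmospheric Retrieval of Exoplanets: Where Reliability meets Adaptive Noise Levels | Proposes a flow-matching-based method for exoplanet atmospheric retrieval, improving reliability and adaptability | flow matching | |
| 11 | Foundations of Safe Online Reinforcement Learning in the Linear Quadratic Regulator: Generalized Baselines | Proposes a safe online reinforcement learning framework for the linear quadratic regulator problem | reinforcement learning | |
| 12 | Disentangled and Self-Explainable Node Representation Learning | Proposes the DiSeNE framework for generating interpretable, disentangled node representations, making graph data easier to understand | representation learning | |
| 13 | SepMamba: State-space models for speaker separation using Mamba | SepMamba: Mamba-based state-space models for speaker separation | Mamba | |
| 14 | ODRL: A Benchmark for Off-Dynamics Reinforcement Learning | ODRL: proposes the first comprehensive benchmark for off-dynamics reinforcement learning | reinforcement learning | |
| 15 | Identifying Selections for Unsupervised Subtask Discovery | Proposes an unsupervised subtask discovery method based on selection mechanisms, improving generalization in multi-task imitation learning | reinforcement learning, imitation learning | |
| 16 | Video to Video Generative Adversarial Network for Few-shot Learning Based on Policy Gradient | Proposes RL-V2V-GAN, a policy-gradient-based model for few-shot video-to-video generation | reinforcement learning, deep reinforcement learning | |
| 17 | Getting By Goal Misgeneralization With a Little Help From a Mentor | Proposes a mentor-assisted reinforcement learning method to mitigate goal misgeneralization | reinforcement learning, PPO | |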

🔬 Pillar 9: Embodied Foundation Models (13 papers)

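| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 18 | AiSciVision: A Framework for Specializing Large Multimodal Models in Scientific Image Classification | AiSciVision: a framework for specializing large multimodal models in scientific image classification | multimodal | |
| 19 | L3Ms -- Lagrange Large Language Models | Proposes L3Ms, which use a Lagrangian approach to align large language models to application-specific requirements | large language model | |
| 20 | Flaming-hot Initiation with Regular Execution Sampling for Large Language Models | Proposes the FIRE sampling method, which efficiently improves the generation quality of large language models on reasoning tasks | large language model | |
| 21 | Large Language Model-Guided Prediction Toward Quantum Materials Synthesis | Uses large language models to predict synthesis pathways for quantum materials, accelerating the discovery of new materials | large language model | |
| 22 | Shopping MMLU: A Massive Multi-Task Online Shopping Benchmark for Large Language Models | Proposes Shopping MMLU, a massive multi-task online shopping benchmark for evaluating the potential of LLMs in e-commerce settings | large language model | |
| 23 | Not All LLM-Generated Data Are Equal: Rethinking Data Weighting in Text Classification | Proposes a weighted-loss method to make better use of LLM-generated data in text classification | large language model | |
| 24 | LLM-Forest: Ensemble Learning of LLMs with Graph-Augmented Prompts for Data Imputation | Proposes the LLM-Forest framework, which ensembles LLMs with graph-augmented prompts for data imputation | large language model | |
| 25 | ShadowKV: KV Cache in Shadows for High-Throughput Long-Context LLM Inference | ShadowKV: low-rank KV cache with dynamic reconstruction for high-throughput long-context LLM inference | large language model | |
| 26 | BLAST: Block-Level Adaptive Structured Matrices for Efficient Deep Neural Network Inference | BLAST: block-level adaptive structured matrices for efficient deep neural network inference | foundation model | |
| 27 | LoRA vs Full Fine-tuning: An Illusion of Equivalence | Reveals differences between LoRA and full fine-tuning: singular value decomposition uncovers "intruder dimensions" | large language model | |
| 28 | LLM-initialized Differentiable Causal Discovery | Proposes LLM-DCD, which uses large language models to initialize differentiable causal discovery, improving the accuracy of inferred causal relationships | large language model | |
| 29 | Bridging the Gap between Expert and Language Models: Concept-guided Chess Commentary Generation and Evaluation | Proposes a concept-guided chess commentary generation method that bridges the gap between expert models and language models | large language model | |
| 30 | Matryoshka Pilot: Learning to Drive Black-Box LLMs with LLMs | Proposes Matryoshka Pilot, which uses a lightweight LLM controller to drive black-box LLMs, improving their handling of complex tasks | large language model | |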

🔬 Pillar 1: Robot Control (1 paper)

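| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 31 | Adversarial Constrained Policy Optimization: Improving Constrained Reinforcement Learning by Adapting Budgets | Proposes Adversarial Constrained Policy Optimization (ACPO), which improves constrained reinforcement learning through adaptive budgets | quadruped locomotion, reinforcement learning | |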

🔬 Pillar 5: Interaction & Reaction (1 paper)

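| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 32 | On Homomorphic Encryption Based Strategies for Class Imbalance in Federated Learning | Proposes FLICKER, a homomorphic-encryption-based solution to class imbalance in federated learning | OMOMO | |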

🔬 Pillar 8: Physics-based Animation (1 paper)

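| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 33 | Strada-LLM: Graph LLM for traffic prediction | Strada-LLM: a graph LLM for traffic prediction that improves prediction accuracy and efficiency | spatiotemporal, large language model | |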
