cs.LG (2024-10-25)

📊 23 papers in total | 🔗 4 with code

🎯 Interest Area Navigation

Pillar 2: RL & Architecture (14 🔗3) · Pillar 9: Embodied Foundation Models (7 🔗1) · Pillar 8: Physics-based Animation (2)

🔬 Pillar 2: RL & Architecture (14 papers)

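| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 1 | Reinforcement Learning for Aligning Large Language Models Agents with Interactive Environments: Quantifying and Mitigating Prompt Overfitting | Proposes a framework for RL fine-tuning of LLM agents that analyzes and mitigates prompt overfitting in interactive environments | reinforcement learning, large language model | |
| 2 | Enhancing Battery Storage Energy Arbitrage with Deep Reinforcement Learning and Time-Series Forecasting | Combines deep reinforcement learning with time-series forecasting to increase battery storage arbitrage revenue | reinforcement learning, deep reinforcement learning, DRL | |
| 3 | Offline Reinforcement Learning with OOD State Correction and OOD Action Suppression | Proposes SCAS, which unifies OOD state correction and OOD action suppression to improve offline reinforcement learning performance | reinforcement learning, offline RL, offline reinforcement learning | |
| 4 | Random Policy Enables In-Context Reinforcement Learning within Trust Horizons | Proposes State-Action Distillation (SAD), enabling in-context reinforcement learning from random policies | reinforcement learning, distillation, foundation model | |
| 5 | Enhancing Safety in Reinforcement Learning with Human Feedback via Rectified Policy Optimization | Proposes the RePO algorithm, which improves safety in reinforcement learning with human feedback through rectified policy optimization | reinforcement learning, large language model | |
| 6 | Multi-Agent Reinforcement Learning with Selective State-Space Models | Proposes Multi-Agent Mamba (MAM), which matches Transformer performance in multi-agent reinforcement learning with better scalability | reinforcement learning, Mamba, SSM | |
| 7 | Improving Inverse Folding for Peptide Design with Diversity-regularized Direct Preference Optimization | Improves inverse folding for peptide design using diversity-regularized direct preference optimization | DPO, direct preference optimization | |
| 8 | Provably Adaptive Average Reward Reinforcement Learning for Metric Spaces | Proposes the ZoRL algorithm for average-reward reinforcement learning in Lipschitz MDPs | reinforcement learning | |
| 9 | Temporal Convolution-based Hybrid Model Approach with Representation Learning for Real-Time Acoustic Anomaly Detection | Proposes a hybrid model based on temporal convolution and representation learning for real-time acoustic anomaly detection | representation learning | |
| 10 | Privacy-Preserving Federated Learning via Dataset Distillation | Proposes FLiP, a privacy-preserving federated learning method based on dataset distillation | distillation | |
| 11 | AgentForge: A Flexible Low-Code Platform for Reinforcement Learning Agent Design | AgentForge: a flexible low-code platform for designing reinforcement learning agents | reinforcement learning | |
| 12 | Toward Finding Strong Pareto Optimal Policies in Multi-Agent Reinforcement Learning | Proposes MGDA++ for finding Pareto optimal policies in multi-agent reinforcement learning | reinforcement learning | |
| 13 | Humanizing the Machine: Proxy Attacks to Mislead LLM Detectors | Proposes an RL-based proxy attack that effectively fools LLM detectors while preserving generation quality | reinforcement learning, large language model | |
| 14 | Adversarial Environment Design via Regret-Guided Diffusion Models | Proposes adversarial environment design via regret-guided diffusion models to improve reinforcement learning robustness | reinforcement learning, deep reinforcement learning | |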

🔬 Pillar 9: Embodied Foundation Models (7 papers)

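| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 15 | Evaluating Cost-Accuracy Trade-offs in Multimodal Search Relevance Judgements | Evaluates cost-accuracy trade-offs in multimodal search relevance judgments | large language model, multimodal | |
| 16 | Conformal Prediction for Multimodal Regression | Proposes a multimodal conformal prediction method for regression, extending conformal prediction to multimodal data such as images and text | multimodal | |
| 17 | Computational Bottlenecks of Training Small-scale Large Language Models | Studies the computational bottlenecks of training small-scale large language models, to optimize model training for low-resource AI research institutions | large language model | |
| 18 | Measuring memorization in language models via probabilistic extraction | Proposes a probabilistic discoverable extraction method to assess memorization risk in language models more reliably | large language model | |
| 19 | Learned Reference-based Diffusion Sampling for multi-modal distributions | Proposes LRDS, a learned reference-based diffusion sampling method for multi-modal distributions | multimodal | |
| 20 | COAT: Compressing Optimizer states and Activation for Memory-Efficient FP8 Training | COAT: memory-efficient FP8 training by compressing optimizer states and activations | large language model | |
| 21 | Neuralink: Fast LLM Inference on Smartphones with Neuron Co-Activation Linking | Proposes Neuralink to optimize large language model inference on smartphones | large language model | |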

🔬 Pillar 8: Physics-based Animation (2 papers)

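| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 22 | Air Quality Prediction with Physics-Guided Dual Neural ODEs in Open Systems | Proposes Air-DualODE, which uses physics-guided dual neural ODEs to predict air quality in open systems | spatiotemporal | |
| 23 | On the Application of Deep Learning for Precise Indoor Positioning in 6G | Proposes LocNet, which uses deep learning to improve positioning accuracy in 6G indoor factory environments | PULSE | |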
