cs.LG (2024-07-25)

📊 20 papers in total | 🔗 1 with code

🎯 Interest Area Navigation

Pillar 2: RL Algorithms and Architecture (10) · Pillar 9: Embodied Foundation Models (9 🔗1) · Pillar 5: Interaction & Reaction (1)

🔬 Pillar 2: RL Algorithms and Architecture (10 papers)

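| # | Title | One-sentence summary | Tags | 🔗 |
|---|-------|----------------------|------|----|
| 1 | Advanced deep-reinforcement-learning methods for flow control: group-invariant and positional-encoding networks improve learning speed and quality | Proposes a deep reinforcement learning method that combines group-invariant networks with positional encoding, speeding up learning and improving flow-control performance. | reinforcement learning, deep reinforcement learning, DRL | |
| 2 | Recursive Introspection: Teaching Language Model Agents How to Self-Improve | Proposes RISE, which uses recursive introspection to improve language models' ability to self-improve on complex reasoning tasks. | reinforcement learning, imitation learning, large language model | |
| 3 | Multi-Agent Deep Reinforcement Learning for Resilience Optimization in 5G RAN | Proposes a multi-agent deep reinforcement learning approach to resilience optimization in 5G RAN. | reinforcement learning, deep reinforcement learning | |
| 4 | Adversarially Robust Decision Transformer | Proposes ARDT, which improves the robustness of Decision Transformer in adversarial environments by learning worst-case returns. | reinforcement learning, decision transformer | |
| 5 | Your Graph Recommender is Provably a Single-view Graph Contrastive Learning | Shows that graph recommenders are equivalent to single-view graph contrastive learning models. | representation learning, contrastive learning | |
| 6 | Principal-Agent Reinforcement Learning: Orchestrating AI Agents with Contracts | Proposes a contract-based principal-agent reinforcement learning framework that aligns the interests of individual AI agents with social welfare. | reinforcement learning | |
| 7 | How to Train the Teacher Model for Effective Knowledge Distillation | Proposes training the teacher model with MSE to improve knowledge distillation, with gains of up to 2.6%. | distillation | |
| 8 | Peak-Controlled Logits Poisoning Attack in Federated Distillation | Proposes PCFDLA, a peak-controlled logits poisoning attack on federated distillation. | distillation | |
| 9 | Maximum Entropy On-Policy Actor-Critic via Entropy Advantage Estimation | Proposes a maximum-entropy on-policy actor-critic algorithm based on entropy advantage estimation, improving reinforcement learning performance. | reinforcement learning, PPO | |
| 10 | Optimal Hessian/Jacobian-Free Nonconvex-PL Bilevel Optimization | Proposes HJFBiO, an optimal Hessian/Jacobian-free algorithm that efficiently solves nonconvex-PL bilevel optimization problems. | reinforcement learning, representation learning | |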

🔬 Pillar 9: Embodied Foundation Models (9 papers)

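| # | Title | One-sentence summary | Tags | 🔗 |
|---|-------|----------------------|------|----|
| 11 | DAM: Towards A Foundation Model for Time Series Forecasting | DAM: a general-purpose foundation model for time series forecasting that handles multi-domain, non-fixed forecasting problems. | foundation model, zero-shot transfer | |
| 12 | On the Opportunities of (Re)-Exploring Atmospheric Science by Foundation Models: A Case Study | Explores the potential of large models in atmospheric science, using GPT-4o as a case study. | foundation model, multimodal | |
| 13 | Large Language Model Integrated Healthcare Cyber-Physical Systems Architecture | Proposes a healthcare cyber-physical systems architecture that integrates large language models to improve efficiency and decision-making. | large language model | |
| 14 | Automated Ensemble Multimodal Machine Learning for Healthcare | AutoPrognosis-M: an automated ensemble multimodal machine learning framework for healthcare. | multimodal | |
| 15 | Fine-Tuning Large Language Models for Stock Return Prediction Using Newsflow | Fine-tunes large language models on newsflow to predict stock returns, improving portfolio performance. | large language model | |
| 16 | Enhancing clinical decision support with physiological waveforms -- a multimodal benchmark in emergency care | Proposes a multimodal benchmark to enhance clinical decision support in emergency care. | multimodal | |
| 17 | LoRA-Pro: Are Low-Rank Adapters Properly Optimized? | LoRA-Pro: significantly improves LoRA fine-tuning performance by optimizing the gradients of the low-rank matrices. | foundation model | |
| 18 | Unlocking Tokens as Data Points for Generalization Bounds on Larger Language Models | Treats tokens as data points to obtain tighter generalization bounds for larger language models. | large language model | |
| 19 | Stay Tuned: An Empirical Study of the Impact of Hyperparameters on LLM Tuning in Real-World Applications | For LLM fine-tuning, proposes Coverage-based Search and offers hyperparameter configuration recommendations for real-world applications. | large language model | |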

🔬 Pillar 5: Interaction & Reaction (1 paper)

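| # | Title | One-sentence summary | Tags | 🔗 |
|---|-------|----------------------|------|----|
| 20 | Privacy-Preserving Hierarchical Model-Distributed Inference | Proposes privateMDI for accelerating privacy-preserving hierarchical model-distributed inference. | OMOMO | |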
