cs.LG (2025-05-31)

📊 32 papers in total | 🔗 6 with code

🎯 Interest Area Navigation

Pillar 2: RL Algorithms & Architecture (18 papers, 🔗 3) · Pillar 9: Embodied Foundation Models (14 papers, 🔗 3)

🔬 Pillar 2: RL Algorithms & Architecture (18 papers)

# | Title | One-Line Summary | Tags
1 | QoQ-Med: Building Multimodal Clinical Foundation Models with Domain-Aware GRPO Training | Proposes QoQ-Med to address reasoning over multimodal clinical data | reinforcement learning, foundation model, multimodal
2 | A Brain Graph Foundation Model: Pre-Training and Prompt-Tuning for Any Atlas and Disorder | Proposes BrainGFM, a graph-based brain-atlas foundation model for diagnosing multiple brain disorders and parcellating brain regions | masked autoencoder, contrastive learning, large language model
3 | MMedAgent-RL: Optimizing Multi-Agent Collaboration for Multimodal Medical Reasoning | Uses reinforcement learning to optimize multi-agent collaboration for multimodal medical reasoning | reinforcement learning, curriculum learning, multimodal
4 | From Rules to Rewards: Reinforcement Learning for Interest Rate Adjustment in DeFi Lending | Applies offline reinforcement learning to optimize interest-rate adjustment in DeFi lending | reinforcement learning, TD3, offline reinforcement learning
5 | Adaptive Plane Reformatting for 4D Flow MRI using Deep Reinforcement Learning | Proposes AdaPR, a deep-RL-based adaptive plane reformatting method that addresses generalizability in 4D flow MRI | reinforcement learning, deep reinforcement learning, DRL
6 | Prompt-Tuned LLM-Augmented DRL for Dynamic O-RAN Network Slicing | Proposes a prompt-tuned, LLM-augmented DRL method for resource allocation in dynamic O-RAN network slicing | reinforcement learning, deep reinforcement learning, DRL
7 | A New Spatiotemporal Correlation Anomaly Detection Method that Integrates Contrastive Learning and Few-Shot Learning in Wireless Sensor Networks | Proposes the MTAD-RD model to address feature extraction and sample imbalance in spatiotemporal anomaly detection for wireless sensor networks | contrastive learning, spatiotemporal
8 | Optimizing Sensory Neurons: Nonlinear Attention Mechanisms for Accelerated Convergence in Permutation-Invariant Neural Networks for Reinforcement Learning | Proposes a nonlinear attention mechanism that accelerates convergence of permutation-invariant neural networks in reinforcement learning | reinforcement learning, linear attention
9 | RLAE: Reinforcement Learning-Assisted Ensemble for LLMs | Proposes RLAE, a reinforcement-learning-assisted ensemble framework that improves LLM performance | reinforcement learning, PPO, large language model
10 | Dynamic Domain Adaptation-Driven Physics-Informed Graph Representation Learning for AC-OPF | Proposes DDA-PIGCN to address complex constraint modeling and spatiotemporal information fusion in AC-OPF | representation learning, MAE, spatiotemporal
11 | Optimized Local Updates in Federated Learning via Reinforcement Learning | Proposes an RL-based method for optimizing local updates in federated learning, improving performance on non-IID data | reinforcement learning, deep reinforcement learning, DRL
12 | ORAN-GUIDE: RAG-Driven Prompt Learning for LLM-Augmented Reinforcement Learning in O-RAN Network Slicing | RAG-driven prompt learning for LLM-augmented reinforcement learning in O-RAN network slicing | reinforcement learning, deep reinforcement learning, DRL
13 | Understanding Behavioral Metric Learning: A Large-Scale Study on Distracting Reinforcement Learning Environments | A large-scale study revealing how behavioral metric learning behaves in distracting reinforcement learning environments | reinforcement learning, deep reinforcement learning
14 | Reinforcement Learning for Hanabi | Explores RL algorithms for cooperative agent strategies in the game of Hanabi | reinforcement learning, deep reinforcement learning
15 | CLARIFY: Contrastive Preference Reinforcement Learning for Untangling Ambiguous Queries | Proposes CLARIFY, which uses contrastive preference learning to resolve ambiguous queries in reinforcement learning | reinforcement learning, contrastive learning
16 | Mitigating Plasticity Loss in Continual Reinforcement Learning by Reducing Churn | Proposes C-CHAIN, which mitigates plasticity loss in continual reinforcement learning by reducing churn | reinforcement learning
17 | AutoMixAlign: Adaptive Data Mixing for Multi-Task Preference Optimization in LLMs | Proposes AutoMixAlign to address multi-task preference optimization | DPO, large language model
18 | Comparing Traditional and Reinforcement-Learning Methods for Energy Storage Control | Compares traditional methods with reinforcement learning for energy storage control, revealing which suits which scenario | reinforcement learning

🔬 Pillar 9: Embodied Foundation Models (14 papers)

# | Title | One-Line Summary | Tags
19 | A Foundation Model for Non-Destructive Defect Identification from Vibrational Spectra | Proposes DefectNet for non-destructive identification and quantification of material defects from vibrational spectra | foundation model
20 | Existing Large Language Model Unlearning Evaluations Are Inconclusive | Reveals the limitations of LLM unlearning evaluations and proposes more reliable evaluation principles | large language model
21 | Probabilistic Forecasting for Building Energy Systems using Time-Series Foundation Models | Uses time-series foundation models for probabilistic forecasting of building energy systems, improving accuracy in data-limited settings | foundation model
22 | M2WLLM: Multi-Modal Multi-Task Ultra-Short-term Wind Power Prediction Algorithm Based on Large Language Model | A multi-modal, multi-task ultra-short-term wind power prediction algorithm based on a large language model | large language model
23 | Spectral Insights into Data-Oblivious Critical Layers in Large Language Models | Proposes a data-oblivious method for identifying critical layers in LLMs, improving interpretability and robustness | large language model
24 | Power-of-Two (PoT) Weights in Large Language Models (LLMs) | Proposes an LLM compression method based on power-of-two quantization, reducing computational complexity | large language model
25 | Pitfalls in Evaluating Language Model Forecasters | Reveals pitfalls in evaluating LLM forecasters and calls for more rigorous evaluation methods | large language model
26 | Linear Representation Transferability Hypothesis: Leveraging Small Models to Steer Large Models | Proposes the linear representation transferability hypothesis to steer the behavior of large models | large language model
27 | FLoE: Fisher-Based Layer Selection for Efficient Sparse Adaptation of Low-Rank Experts | Fisher-information-based layer selection for efficient sparse adaptation of low-rank experts | large language model
28 | It Takes a Good Model to Train a Good Model: Generalized Gaussian Priors for Optimized LLMs | An LLM optimization framework based on generalized Gaussian priors, enabling efficient, scalable, hardware-friendly AI systems | large language model
29 | BenchHub: A Unified Benchmark Suite for Holistic and Customizable LLM Evaluation | Proposes BenchHub to standardize LLM evaluation | large language model
30 | Revisiting LLMs as Zero-Shot Time-Series Forecasters: Small Noise Can Break Large Models | Shows that LLMs used as zero-shot time-series forecasters are sensitive to noise and perform poorly | large language model
31 | Channel Normalization for Time Series Channel Identification | Proposes channel normalization (CN) to enhance channel identifiability in time-series models and improve performance | foundation model
32 | BatteryBERT for Realistic Battery Fault Detection Using Point-Masked Signal Modeling | A battery fault detection method based on point-masked signal modeling | large language model
