cs.LG(2025-09-11)

📊 16 papers in total | 🔗 1 with code

🎯 Interest Area Navigation

Pillar 2: RL Algorithms & Architecture (11) · Pillar 9: Embodied Foundation Models (4, 🔗 1) · Pillar 7: Motion Retargeting (1)

🔬 Pillar 2: RL Algorithms & Architecture (11 papers)

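| # | Title | One-sentence summary | Tags |
|---|-------|----------------------|------|
| 1 | Feasibility-Guided Fair Adaptive Offline Reinforcement Learning for Medicaid Care Management | Proposes feasibility-guided fair adaptive offline reinforcement learning to improve Medicaid care management. | reinforcement learning, offline RL, offline reinforcement learning |
| 2 | Hybrid Adaptive Conformal Offline Reinforcement Learning for Fair Population Health Management | Proposes the Hybrid Adaptive Conformal Offline RL (HACO) framework for fair population health management. | reinforcement learning, offline RL, offline reinforcement learning |
| 3 | Vejde: A Framework for Inductive Deep Reinforcement Learning Based on Factor Graph Color Refinement | Vejde: an inductive deep reinforcement learning framework based on factor graph color refinement, aimed at decision problems with structured states. | reinforcement learning, deep reinforcement learning |
| 4 | Quantum-Enhanced Forecasting for Deep Reinforcement Learning in Algorithmic Trading | Proposes a quantum-enhanced deep reinforcement learning approach to algorithmic trading that improves returns in forex trading. | reinforcement learning, deep reinforcement learning |
| 5 | Revisiting Actor-Critic Methods in Discrete Action Off-Policy Reinforcement Learning | Decouples entropy regularization in actor-critic methods to improve discrete-action off-policy reinforcement learning. | reinforcement learning, PPO, SAC |
| 6 | Meta-Learning Reinforcement Learning for Crypto-Return Prediction | Proposes Meta-RL-Crypto, a self-improving trading agent for cryptocurrency return prediction. | reinforcement learning, multimodal |
| 7 | Harnessing Uncertainty: Entropy-Modulated Policy Gradients for Long-Horizon LLM Agents | Proposes entropy-modulated policy gradients to address sparse rewards in long-horizon tasks. | reinforcement learning, inverse reinforcement learning, large language model |
| 8 | Continuous-Time Value Iteration for Multi-Agent Reinforcement Learning | Proposes a continuous-time multi-agent reinforcement learning framework based on physics-informed neural networks, addressing high-frequency interaction and the curse of dimensionality. | reinforcement learning, policy learning |
| 9 | Finite Scalar Quantization Enables Redundant and Transmission-Robust Neural Audio Compression at Low Bit-rates | NeuCodec: a robust neural audio codec based on finite scalar quantization. | distillation, large language model |
| 10 | Incentivizing Safer Actions in Policy Optimization for Constrained Reinforcement Learning | Proposes the IP3O algorithm, which uses an adaptive incentive mechanism to improve safety in policy optimization for constrained reinforcement learning. | reinforcement learning |
| 11 | Clip Your Sequences Fairly: Enforcing Length Fairness for Sequence-Level RL | Proposes FSPO, which addresses length bias in sequence-level reinforcement learning through length-fair clipping. | reinforcement learning, PPO |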

🔬 Pillar 9: Embodied Foundation Models (4 papers)

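| # | Title | One-sentence summary | Tags |
|---|-------|----------------------|------|
| 12 | Safe-SAIL: Towards a Fine-grained Safety Landscape of Large Language Models via Sparse Autoencoder Interpretation Framework | Safe-SAIL: fine-grained safety analysis of large language models via a sparse autoencoder interpretation framework. | large language model |
| 13 | Latency and Token-Aware Test-Time Compute | Proposes a dynamic compute allocation framework to optimize large language model inference performance. | large language model |
| 14 | One Head, Many Models: Cross-Attention Routing for Cost-Aware LLM Selection | Proposes an LLM selection framework based on single-head cross-attention routing to optimize cost-effectiveness. | large language model |
| 15 | ButterflyQuant: Ultra-low-bit LLM Quantization through Learnable Orthogonal Butterfly Transforms | Proposes ButterflyQuant to address ultra-low-bit LLM quantization. | large language model |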

🔬 Pillar 7: Motion Retargeting (1 paper)

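| # | Title | One-sentence summary | Tags |
|---|-------|----------------------|------|
| 16 | Graph Alignment via Dual-Pass Spectral Encoding and Latent Space Communication | Proposes a graph alignment framework with dual-pass spectral encoding and latent space communication, improving node distinctiveness and geometric consistency. | geometric consistency |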
