cs.LG (2025-07-06)

📊 15 papers in total

🎯 Interest Areas

Pillar 2: RL Algorithms & Architecture (RL & Architecture) (8) · Pillar 9: Embodied Foundation Models (7)

🔬 Pillar 2: RL Algorithms & Architecture (RL & Architecture) (8 papers)

| # | Title | One-sentence takeaway | Tags | 🔗 |
|---|-------|-----------------------|------|----|
| 1 | Inverse Reinforcement Learning using Revealed Preferences and Passive Stochastic Optimization | Proposes an inverse reinforcement learning method based on revealed preferences and passive stochastic optimization for learning an agent's utility function. | reinforcement learning, inverse reinforcement learning | |
| 2 | ESSA: Evolutionary Strategies for Scalable Alignment | ESSA: a scalable evolutionary-strategy approach to aligning large language models without gradient-based optimization. | reinforcement learning, PPO, RLHF | |
| 3 | Time2Agri: Temporal Pretext Tasks for Agricultural Monitoring | Time2Agri: temporal self-supervised pretraining tasks for agricultural monitoring. | MAE, contrastive learning, foundation model | |
| 4 | Convergence and Sample Complexity of First-Order Methods for Agnostic Reinforcement Learning | Proposes a new framework for reinforcement learning problems in which no optimal policy exists. | reinforcement learning, policy learning | |
| 5 | Tractable Representation Learning with Probabilistic Circuits | Proposes autoencoding probabilistic circuits (APC) for interpretable representation learning and robust probabilistic inference. | representation learning, distillation | |
| 6 | Interactive Groupwise Comparison for Reinforcement Learning from Human Feedback | Proposes an interactive groupwise-comparison method that improves reinforcement learning from human feedback. | reinforcement learning, RLHF | |
| 7 | Enhancing Text-Based Hierarchical Multilabel Classification for Mobile Applications via Contrastive Learning | Proposes HMCN and HMCL to improve hierarchical multilabel classification of mobile-application text. | contrastive learning | |
| 8 | Hierarchical Reinforcement Learning with Targeted Causal Interventions | Proposes a hierarchical reinforcement learning method based on targeted causal interventions, improving efficiency on long-horizon sparse-reward tasks. | reinforcement learning | |

🔬 Pillar 9: Embodied Foundation Models (7 papers)

| # | Title | One-sentence takeaway | Tags | 🔗 |
|---|-------|-----------------------|------|----|
| 9 | Model Inversion Attacks on Llama 3: Extracting PII from Large Language Models | Model inversion attacks on Llama 3 reveal the risk of PII leakage. | large language model | |
| 10 | Sampling-aware Adversarial Attacks Against Large Language Models | Proposes sampling-aware adversarial attacks, improving the success rate and efficiency of harmful-response attacks on large language models. | large language model | |
| 11 | Evaluating LLMs on Real-World Forecasting Against Expert Forecasters | Evaluates LLM performance on real-world forecasting tasks against expert forecasters. | large language model | |
| 12 | DOTResize: Reducing LLM Width via Discrete Optimal Transport-based Neuron Merging | DOTResize: reduces LLM width via neuron merging based on discrete optimal transport. | large language model | |
| 13 | Source Attribution in Retrieval-Augmented Generation | Proposes a Shapley-value-based document attribution method for RAG systems, improving interpretability while reducing computational cost. | large language model | |
| 14 | LoRA Is Slower Than You Think | Shows that LoRA fine-tuning is not always faster and proposes more efficient methods for LLM fine-tuning. | large language model | |
| 15 | Just Enough Shifts: Mitigating Over-Refusal in Aligned Language Models with Targeted Representation Fine-Tuning | Proposes the ACTOR framework, which mitigates over-refusal in aligned language models via targeted fine-tuning of activation patterns. | large language model | |
