cs.LG (2024-12-27)

📊 8 papers in total | 🔗 2 with code

🎯 Interest Area Navigation

Pillar 2: RL Algorithms & Architecture (6) · Pillar 9: Embodied Foundation Models (2 🔗2)

🔬 Pillar 2: RL Algorithms & Architecture (6 papers)

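| # | Title | One-sentence summary | Tags | 🔗 |
|---|-------|----------------------|------|----|
| 1 | Low-Rank Contextual Reinforcement Learning from Heterogeneous Human Feedback | Proposes the LoCo-RLHF framework, which uses low-rank contextual information to address reward learning from heterogeneous human feedback. | reinforcement learning, offline reinforcement learning, RLHF | |
| 2 | Numerical solutions of fixed points in two-dimensional Kuramoto-Sivashinsky equation expedited by reinforcement learning | Proposes a JFNK method optimized with reinforcement learning to speed up solving for fixed points of the two-dimensional Kuramoto-Sivashinsky equation. | reinforcement learning, deep reinforcement learning, DRL | |
| 3 | Comparing Few to Rank Many: Active Human Preference Learning using Randomized Frank-Wolfe | Proposes a randomized Frank-Wolfe algorithm to optimize human preference learning. | reinforcement learning, preference learning | |
| 4 | Enhancing Adversarial Robustness of Deep Neural Networks Through Supervised Contrastive Learning | Combines supervised contrastive learning with a margin loss to improve the adversarial robustness of deep neural networks. | contrastive learning | |
| 5 | Minimax-Optimal Multi-Agent Robust Reinforcement Learning | Proposes an extension of the Q-FTRL algorithm to RMGs, achieving minimax-optimal multi-agent robust reinforcement learning. | reinforcement learning | |
| 6 | Graph-attention-based Casual Discovery with Trust Region-navigated Clipping Policy Optimization | Proposes a graph-attention-based causal discovery method that improves performance through trust-region-navigated clipping policy optimization. | reinforcement learning, PPO | |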

🔬 Pillar 9: Embodied Foundation Models (2 papers)

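| # | Title | One-sentence summary | Tags | 🔗 |
|---|-------|----------------------|------|----|
| 7 | Fortran2CPP: Automating Fortran-to-C++ Translation using LLMs via Multi-Turn Dialogue and Dual-Agent Integration | Fortran2CPP: uses multi-turn dialogue and dual-agent integration for LLM-based automatic translation of Fortran to C++. | large language model | |
| 8 | Gradient Weight-normalized Low-rank Projection for Efficient LLM Training | Proposes Gradient Weight-normalized Low-rank Projection (GradNormLoRP) for efficient training of large language models. | large language model | |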
