cs.LG (2024-10-26)

📊 12 papers in total | 🔗 2 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (7 🔗2) · Pillar 1: Robot Control (3) · Pillar 2: RL & Architecture (2)

🔬 Pillar 9: Embodied Foundation Models (7 papers)

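| # | Title | One-sentence summary | Tags | 🔗 |
| 1 | Centaur: a foundation model of human cognition | Centaur is a foundation model that predicts human cognition and can simulate human behavior across a variety of experimental settings. | foundation model | |
| 2 | Generative AI in Health Economics and Outcomes Research: A Taxonomy of Key Definitions and Emerging Applications, an ISPOR Working Group Report | Presents generative AI as a way to improve the efficiency and accuracy of health economics and outcomes research. | foundation model, chain-of-thought | |
| 3 | Transferable Adversarial Attacks on SAM and Its Downstream Models | Proposes UMI-GRAT, a transferable adversarial attack on SAM and its downstream models. | foundation model | |
| 4 | Prompt Diffusion Robustifies Any-Modality Prompt Learning | Proposes Prompt Diffusion to improve the robustness of prompt learning in any modality. | foundation model | |
| 5 | Library Learning Doesn't: The Curious Case of the Single-Use "Library" | Shows that libraries learned by LLMs for mathematical reasoning are used only once, calling their reusability into question. | large language model | |
| 6 | Model Equality Testing: Which Model Is This API Serving? | Proposes model equality testing to detect whether the model served by a black-box API has been altered. | large language model | |
| 7 | Deep Optimizer States: Towards Scalable Training of Transformer Models Using Interleaved Offloading | Proposes Deep Optimizer States to address the memory bottleneck in training Transformer models. | large language model | |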

🔬 Pillar 1: Robot Control (3 papers)

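| # | Title | One-sentence summary | Tags | 🔗 |
| 8 | Overcoming the Sim-to-Real Gap: Leveraging Simulation to Learn to Explore for Real-World RL | Uses a simulator to learn exploration policies, improving the efficiency of real-world reinforcement learning and overcoming the sim-to-real gap. | sim-to-real, sim2real, reinforcement learning | |
| 9 | Beyond Simple Sum of Delayed Rewards: Non-Markovian Reward Modeling for Reinforcement Learning | Proposes CoDeTr, which models non-Markovian rewards to handle composite delayed rewards in reinforcement learning. | locomotion, reinforcement learning | |
| 10 | Classification under strategic adversary manipulation using pessimistic bilevel optimisation | Proposes a classification method for adversarial samples based on pessimistic bilevel optimisation, improving the robustness of malicious-data detection. | manipulation | |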

🔬 Pillar 2: RL & Architecture (2 papers)

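| # | Title | One-sentence summary | Tags | 🔗 |
| 11 | Uncertainty-Penalized Direct Preference Optimization | Proposes an uncertainty-penalized variant of direct preference optimization to make LLM alignment with human preferences more robust. | reinforcement learning, offline reinforcement learning, RLHF | |
| 12 | GFlowNet Fine-tuning for Diverse Correct Solutions in Mathematical Reasoning Tasks | Fine-tunes LLMs with GFlowNet to generate diverse correct solutions for mathematical reasoning tasks. | reinforcement learning, large language model | |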
