cs.LG (2025-06-11)
📊 13 papers in total | 🔗 1 with code
🎯 Navigation by interest area
Pillar 9: Embodied Foundation Models (6)
Pillar 2: RL Algorithms & Architecture (5 🔗1)
Pillar 1: Robot Control (1)
Pillar 4: Generative Motion (1)
🔬 Pillar 9: Embodied Foundation Models (6 papers)
| # | Title | Key point | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 1 | Athena: Enhancing Multimodal Reasoning with Data-efficient Process Reward Models | Proposes Athena-PRM, a data-efficient process reward model for multimodal reasoning | multimodal | | |
| 2 | FedVLMBench: Benchmarking Federated Fine-Tuning of Vision-Language Models | Proposes FedVLMBench to evaluate federated fine-tuning of vision-language models | foundation model, multimodal | | |
| 3 | AWP: Activation-Aware Weight Pruning and Quantization with Projected Gradient Descent | Proposes AWP, a method for compressing large language models | large language model | | |
| 4 | Prompt Variability Effects On LLM Code Generation | Proposes a synthetic evaluation pipeline to quantify how prompt variability affects LLM code generation | large language model | | |
| 5 | Flipping Against All Odds: Reducing LLM Coin Flip Bias via Verbalized Rejection Sampling | Proposes Verbalized Rejection Sampling to reduce LLM sampling bias | large language model | | |
| 6 | Learning Obfuscations Of LLM Embedding Sequences: Stained Glass Transform | Proposes the Stained Glass Transform to protect the privacy of LLM embedding sequences | large language model | | |
🔬 Pillar 2: RL Algorithms & Architecture (5 papers)
| # | Title | Key point | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 7 | Omni-DPO: A Dual-Perspective Paradigm for Dynamic Preference Learning of LLMs | Proposes Omni-DPO to improve data utilization in dynamic preference learning | reinforcement learning, preference learning, RLHF | ✅ | |
| 8 | Efficient Preference-Based Reinforcement Learning: Randomized Exploration Meets Experimental Design | Proposes an efficient preference-based reinforcement learning method that addresses query selection | reinforcement learning | | |
| 9 | Probabilistic Variational Contrastive Learning | Proposes variational contrastive learning to address uncertainty quantification | contrastive learning | | |
| 10 | Attention on flow control: transformer-based reinforcement learning for lift regulation in highly disturbed flows | Proposes transformer-based reinforcement learning for lift regulation in highly disturbed flows | reinforcement learning | | |
| 11 | Canonical Latent Representations in Conditional Diffusion Models | Proposes CLAReps to address feature confusion in conditional diffusion models | representation learning, distillation | | |
🔬 Pillar 1: Robot Control (1 paper)
| # | Title | Key point | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 12 | Provable Sim-to-Real Transfer via Offline Domain Randomization | Proposes offline domain randomization to address sim-to-real transfer | sim-to-real, domain randomization, reinforcement learning | | |
🔬 Pillar 4: Generative Motion (1 paper)
| # | Title | Key point | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 13 | AtmosMJ: Revisiting Gating Mechanism for AI Weather Forecasting Beyond the Year Scale | Proposes AtmosMJ to address the stability of long-horizon weather forecasting | physically plausible | | |