cs.LG (2025-01-31)

📊 36 papers in total | 🔗 4 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (20 🔗3) · Pillar 2: RL & Architecture (12 🔗1) · Pillar 8: Physics-based Animation (2) · Pillar 1: Robot Control (1) · Pillar 3: Perception & Semantics (1)

🔬 Pillar 9: Embodied Foundation Models (20 papers)

| # | Title | One-line Summary | Tags | 🔗 |
| --- | --- | --- | --- | --- |
| 1 | HackerRank-ASTRA: Evaluating Correctness & Consistency of Large Language Models on cross-domain multi-file project problems | Evaluates the correctness and consistency of LLMs on cross-domain, multi-file project problems | large language model | |
| 2 | Should You Use Your Large Language Model to Explore or Exploit? | Evaluates LLMs on the explore-exploit trade-off, finding promise in exploring semantic action spaces | large language model | |
| 3 | Cache Me If You Must: Adaptive Key-Value Quantization for Large Language Models | AQUA-KV: adaptive key-value quantization that raises LLM KV-cache compression ratios while preserving accuracy | large language model | |
| 4 | Pivoting Factorization: A Compact Meta Low-Rank Representation of Sparsity for Efficient Inference in Large Language Models | Proposes PIFA, a compact low-rank representation for accelerating LLM inference | large language model | |
| 5 | Towards the Worst-case Robustness of Large Language Models | Proposes an approach for evaluating the worst-case robustness of LLMs | large language model | |
| 6 | Symmetric Pruning of Large Language Models | Develops a theory of symmetric pruning that combines activation and weight importance, markedly improving LLM pruning | large language model | |
| 7 | Improving Rule-based Reasoning in LLMs using Neurosymbolic Representations | Proposes neurosymbolic representations that improve LLM rule-following on mathematical reasoning tasks | large language model, chain-of-thought | |
| 8 | Spend Wisely: Maximizing Post-Training Gains in Iterative Synthetic Data Bootstrapping | Optimizes iterative synthetic-data bootstrapping to maximize post-training performance gains | large language model, foundation model | |
| 9 | Byzantine-Resilient Zero-Order Optimization for Communication-Efficient Heterogeneous Federated Learning | Proposes CyBeR-0 to counter Byzantine attacks in heterogeneous federated learning | large language model | |
| 10 | Federated Sketching LoRA: A Flexible Framework for Heterogeneous Collaborative Fine-Tuning of LLMs | Proposes federated sketching LoRA for efficient collaborative fine-tuning of LLMs in heterogeneous federated settings | large language model | |
| 11 | Judge Decoding: Faster Speculative Sampling Requires Going Beyond Model Alignment | Proposes Judge Decoding, which trains a judgment module to markedly accelerate LLM inference beyond the limits of model alignment | large language model | |
| 12 | Offline Learning for Combinatorial Multi-armed Bandits | Proposes the Off-CMAB framework for offline learning in combinatorial multi-armed bandits | large language model | |
| 13 | Elucidating Subspace Perturbation in Zeroth-Order Optimization: Theory and Practice at Scale | Proposes MeZO-BCD, accelerating zeroth-order optimization for LLM fine-tuning by up to 2.77x | large language model | |
| 14 | TeZO: Empowering the Low-Rankness on the Temporal Dimension in the Zeroth-Order Optimization for Fine-tuning LLMs | TeZO: exploits low-rankness along the temporal dimension to improve the efficiency of zeroth-order LLM fine-tuning | large language model | |
| 15 | Importing Phantoms: Measuring LLM Package Hallucination Vulnerabilities | Measures package-hallucination vulnerabilities in LLM code generation, exposing potential software supply-chain attack risks | large language model | |
| 16 | TabFSBench: Tabular Benchmark for Feature Shifts in Open Environments | Proposes TabFSBench, the first benchmark for feature shifts in tabular data, evaluating model generalization in open environments | large language model | |
| 17 | LLM Program Optimization via Retrieval Augmented Search | Proposes retrieval-augmented search (RAS) for optimizing LLM-generated programs, with AEGIS for improved interpretability | large language model | |
| 18 | Scaling Laws for Differentially Private Language Models | Establishes scaling laws for differentially private language models to optimize training configurations | large language model | |
| 19 | Predictive Prompt Analysis | Proposes SPA, a sparse-autoencoder-based predictive prompt analysis method that speeds up LLM prompt engineering | large language model | |
| 20 | Partially Rewriting a Transformer in Natural Language | Proposes a method for partially rewriting a Transformer, replacing network components with natural-language explanations to improve interpretability | large language model | |

🔬 Pillar 2: RL & Architecture (12 papers)

| # | Title | One-line Summary | Tags | 🔗 |
| --- | --- | --- | --- | --- |
| 21 | Vintix: Action Model via In-Context Reinforcement Learning | Vintix: a cross-domain generalist action model built via in-context reinforcement learning | reinforcement learning, distillation, generalist agent | |
| 22 | Decorrelated Soft Actor-Critic for Efficient Deep Reinforcement Learning | Proposes Decorrelated Soft Actor-Critic (DSAC) to improve the sample efficiency of deep reinforcement learning | reinforcement learning, deep reinforcement learning, SAC | |
| 23 | The Energy Loss Phenomenon in RLHF: A New Perspective on Mitigating Reward Hacking | Proposes an energy-loss-aware PPO algorithm to mitigate reward hacking in RLHF | reinforcement learning, PPO, RLHF | |
| 24 | CAAT-EHR: Cross-Attentional Autoregressive Transformer for Multimodal Electronic Health Record Embeddings | CAAT-EHR: generates multimodal electronic health record embeddings with a cross-attentional autoregressive Transformer | predictive model, multimodal | |
| 25 | Best Policy Learning from Trajectory Preference Feedback | Proposes a posterior-sampling preference-learning algorithm for best-policy identification | reinforcement learning, policy learning, preference learning | |
| 26 | O-MAPL: Offline Multi-agent Preference Learning | O-MAPL: an offline multi-agent preference-learning framework that improves performance on cooperative game tasks | reinforcement learning, preference learning | |
| 27 | BRiTE: Bootstrapping Reinforced Thinking Process to Enhance Language Model Reasoning | BRiTE: bootstraps a reinforced thinking process to enhance language model reasoning | reinforcement learning, reward shaping, large language model | |
| 28 | Improving Multi-Label Contrastive Learning by Leveraging Label Distribution | Proposes label-distribution-based multi-label contrastive learning to improve representation learning | contrastive learning | |
| 29 | Reward-aware Preference Optimization: A Unified Mathematical Framework for Model Alignment | RPO: a unified reward-aware preference optimization framework for model alignment | DPO, large language model | |
| 30 | Reinforcement Learning on Reconfigurable Hardware: Overcoming Material Variability in Laser Material Processing | Proposes FPGA-accelerated reinforcement learning control for laser material processing to overcome material variability | reinforcement learning | |
| 31 | Optimizing Job Allocation using Reinforcement Learning with Graph Neural Networks | Proposes a job allocation optimization method combining reinforcement learning with graph neural networks | reinforcement learning | |
| 32 | Towards Physiologically Sensible Predictions via the Rule-based Reinforcement Learning Layer | Proposes a rule-based reinforcement learning layer that corrects physiologically impossible predictions in medical predictive models | reinforcement learning | |

🔬 Pillar 8: Physics-based Animation (2 papers)

| # | Title | One-line Summary | Tags | 🔗 |
| --- | --- | --- | --- | --- |
| 33 | An Optimal Cascade Feature-Level Spatiotemporal Fusion Strategy for Anomaly Detection in CAN Bus | Proposes a genetic-algorithm-optimized cascade feature-level spatiotemporal fusion strategy for CAN bus anomaly detection | spatiotemporal | |
| 34 | BCAT: A Block Causal Transformer for PDE Foundation Models for Fluid Dynamics | BCAT: a block causal Transformer for PDE foundation models in fluid dynamics | spatiotemporal, foundation model | |

🔬 Pillar 1: Robot Control (1 paper)

| # | Title | One-line Summary | Tags | 🔗 |
| --- | --- | --- | --- | --- |
| 35 | Shaping Sparse Rewards in Reinforcement Learning: A Semi-supervised Approach | Proposes a semi-supervised reinforcement learning approach to reward shaping under sparse rewards | manipulation, reinforcement learning, reward shaping | |

🔬 Pillar 3: Perception & Semantics (1 paper)

| # | Title | One-line Summary | Tags | 🔗 |
| --- | --- | --- | --- | --- |
| 36 | Capturing Temporal Dynamics in Large-Scale Canopy Tree Height Estimation | Proposes a temporal model that uses satellite data to produce high-resolution, continental-scale canopy height maps of Europe for forest monitoring | height map | |

⬅️ Back to cs.LG home · 🏠 Back to main page