cs.LG (2025-12-18)

📊 19 papers in total

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (11) · Pillar 2: RL & Architecture (6) · Pillar 1: Robot Control (2)

🔬 Pillar 9: Embodied Foundation Models (11 papers)

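| # | Title | One-Sentence Summary | Tags |
|---|---|---|---|
| 1 | Pretrained Battery Transformer (PBT): A battery life prediction foundation model | Proposes the Pretrained Battery Transformer (PBT) for battery life prediction, substantially improving generalization. | foundation model |
| 2 | Coarse-to-Fine Open-Set Graph Node Classification with Large Language Models | Proposes the CFC framework, which uses large language models for open-set graph node classification and fine-grained OOD identification. | large language model |
| 3 | A Multimodal Approach to Alzheimer's Diagnosis: Geometric Insights from Cube Copying and Cognitive Assessments | Proposes a multimodal fusion method based on graph neural networks for early diagnosis of Alzheimer's disease. | multimodal |
| 4 | Impacts of Racial Bias in Historical Training Data for News AI | Reveals historical data bias in news AI: using the New York Times corpus as a case study, analyzes the impact of racial labels. | large language model |
| 5 | DataFlow: An LLM-Driven Framework for Unified Data Preparation and Workflow Automation in the Era of Data-Centric AI | DataFlow: an LLM-driven framework for unified data preparation and workflow automation. | large language model |
| 6 | Muon is Provably Faster with Momentum Variance Reduction | Improves the performance of the Muon optimizer through momentum variance reduction. | large language model |
| 7 | A Systematic Study of Code Obfuscation Against LLM-based Vulnerability Detection | Systematically studies how code obfuscation affects LLM-based vulnerability detection, revealing patterns in how detection performance changes. | large language model |
| 8 | Feature-Selective Representation Misdirection for Machine Unlearning | Proposes the selective representation misdirection (SRMU) framework to address knowledge unlearning in LLMs while balancing safety and utility. | large language model |
| 9 | In-Context Probing for Membership Inference in Fine-Tuned Language Models | Proposes the ICP-MIA framework, which uses in-context probing for membership inference attacks on fine-tuned language models. | large language model |
| 10 | CKA-Guided Modular Quantization: Beyond Bit-Width to Algorithmic Diversity | Proposes a CKA-guided modular quantization method that enables algorithmic diversity in quantization across the layers of large models. | large language model |
| 11 | Staggered Batch Scheduling: Co-optimizing Time-to-First-Token and Throughput for High-Efficiency LLM Inference | Proposes Staggered Batch Scheduling (SBS) to optimize time-to-first-token latency and throughput for LLM inference under DP+EP architectures. | large language model |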

🔬 Pillar 2: RL & Architecture (6 papers)

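| # | Title | One-Sentence Summary | Tags |
|---|---|---|---|
| 12 | Non-Asymptotic Global Convergence of PPO-Clip | Presents a non-asymptotic global convergence analysis of the PPO-Clip algorithm. | reinforcement learning, PPO, RLHF |
| 13 | Stackelberg Learning from Human Feedback: Preference Optimization as a Sequential Game | Proposes the Stackelberg Learning from Human Feedback (SLHF) framework for preference optimization. | reinforcement learning, RLHF, large language model |
| 14 | Exploration v.s. Exploitation: Rethinking RLVR through Clipping, Entropy, and Spurious Reward | Rethinks RLVR through clipping, entropy, and spurious rewards to improve LLM reasoning. | reinforcement learning, large language model |
| 15 | Meta-RL Induces Exploration in Language Agents | LaMer: uses meta-reinforcement learning to improve the exploration ability of language agents in complex environments. | reinforcement learning, large language model |
| 16 | On The Hidden Biases of Flow Matching Samplers | Reveals hidden biases in flow matching samplers and analyzes their energy suboptimality. | flow matching |
| 17 | NDRL: Cotton Irrigation and Nitrogen Application with Nested Dual-Agent Reinforcement Learning | Proposes the NDRL method to address the complexity of cotton irrigation and nitrogen application. | reinforcement learning |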

🔬 Pillar 1: Robot Control (2 papers)

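| # | Title | One-Sentence Summary | Tags |
|---|---|---|---|
| 18 | AIMM: An AI-Driven Multimodal Framework for Detecting Social-Media-Influenced Stock Market Manipulation | AIMM: an AI-driven multimodal framework for detecting social-media-influenced stock market manipulation. | manipulation, multimodal |
| 19 | Posterior Behavioral Cloning: Pretraining BC Policies for Efficient RL Finetuning | Proposes Posterior Behavioral Cloning (PostBC) to improve pretrained policies for RL finetuning. | manipulation, reinforcement learning |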
