cs.LG (2025-02-21)

📊 25 papers in total | 🔗 4 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (14 🔗3) · Pillar 2: RL & Architecture (9 🔗1) · Pillar 8: Physics-based Animation (1) · Pillar 4: Generative Motion (1)

🔬 Pillar 9: Embodied Foundation Models (14 papers)

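| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 1 | The Relationship Between Reasoning and Performance in Large Language Models -- o3 (mini) Thinks Harder, Not Longer | Shows that gains in LLM reasoning ability come from more efficient reasoning rather than from longer reasoning chains. | large language model, chain-of-thought | |
| 2 | Directional Gradient Projection for Robust Fine-Tuning of Foundation Models | Proposes Directional Gradient Projection (DiGraP) for robust fine-tuning of pretrained models. | foundation model | |
| 3 | Mitigating Data Scarcity in Time Series Analysis: A Foundation Model with Series-Symbol Data Generation | Proposes SymTime, a pretrained model for time series analysis built on series-symbol bimodal data generation, to mitigate data scarcity. | foundation model | |
| 4 | Single-pass Detection of Jailbreaking Input in Large Language Models | Proposes SPD, a single-forward-pass detection method that efficiently defends LLMs against jailbreak attacks. | large language model | |
| 5 | IPAD: Inverse Prompt for AI Detection - A Robust and Interpretable LLM-Generated Text Detector | Proposes IPAD, a robust and interpretable detector for LLM-generated text that addresses the poor generalization of existing detectors. | large language model | |
| 6 | A Cautionary Tale About "Neutrally" Informative AI Tools Ahead of the 2025 Federal Elections in Germany | Warns that AI tools may carry risks of political bias and misinformation in Germany's 2025 federal elections. | large language model | |
| 7 | Steering LLMs for Formal Theorem Proving | Proposes an activation steering method to improve LLM performance in formal theorem proving. | large language model | |
| 8 | R-LoRA: Randomized Multi-Head LoRA for Efficient Multi-Task Learning | R-LoRA: a randomized multi-head LoRA for efficient multi-task learning. | large language model | |
| 9 | Fed-SB: A Silver Bullet for Extreme Communication Efficiency and Performance in (Private) Federated LoRA Fine-Tuning | Fed-SB: a communication-efficient, high-performance "silver bullet" for (private) federated LoRA fine-tuning. | foundation model | |
| 10 | Adversarial Prompt Evaluation: Systematic Benchmarking of Guardrails Against Prompt Input Attacks on LLMs | Systematically evaluates defenses against prompt injection attacks on LLMs and reveals the limitations of existing defenses. | large language model | |
| 11 | CoKV: Optimizing KV Cache Allocation via Cooperative Game | CoKV: optimizes KV cache allocation via a cooperative game to improve LLM long-context processing. | large language model | |
| 12 | SVDq: 1.25-bit and 410x Key Cache Compression for LLM Attention | SVDq: an SVD-based mixed-precision quantization method that achieves an ultra-high compression ratio for the KV cache in LLM attention. | large language model | |
| 13 | Improving Value-based Process Verifier via Structural Prior Injection | Improves value-based process verifiers through structural prior injection. | large language model | |
| 14 | Auto-Bench: An Automated Benchmark for Scientific Discovery in LLMs | Auto-Bench: a new automated benchmark for evaluating LLM capabilities in scientific discovery. | large language model | |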

🔬 Pillar 2: RL & Architecture (9 papers)

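| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 15 | The Evolving Landscape of LLM- and VLM-Integrated Reinforcement Learning | Surveys the use of LLMs and VLMs in reinforcement learning, addressing challenges such as lack of knowledge, long-horizon planning, and reward design. | reinforcement learning, reward design, large language model | |
| 16 | Mantis: Lightweight Calibrated Foundation Model for User-Friendly Time Series Classification | Mantis: a lightweight, calibrated foundation model for time series classification that improves user-friendliness. | contrastive learning, foundation model | |
| 17 | Hyperspherical Normalization for Scalable Deep Reinforcement Learning | SimbaV2 uses hyperspherical normalization and reward scaling to improve the scalability and stability of deep reinforcement learning with large models. | reinforcement learning, deep reinforcement learning | |
| 18 | SALSA-RL: Stability Analysis in the Latent Space of Actions for Reinforcement Learning | SALSA-RL: a reinforcement learning method based on stability in the latent space of actions, improving interpretability. | reinforcement learning, deep reinforcement learning, DRL | |
| 19 | Enhancing PPO with Trajectory-Aware Hybrid Policies | Proposes the HP3O algorithm, which augments PPO with a trajectory replay buffer to improve reinforcement learning performance. | reinforcement learning, PPO | |
| 20 | Towards a Reward-Free Reinforcement Learning Framework for Vehicle Control | Proposes a reward-free reinforcement learning framework to address the bias of hand-designed rewards in vehicle control. | reinforcement learning, imitation learning | |
| 21 | SpikeRL: A Scalable and Energy-efficient Framework for Deep Spiking Reinforcement Learning | SpikeRL: a scalable and energy-efficient deep spiking reinforcement learning framework for complex continuous control tasks. | reinforcement learning, deep reinforcement learning | |
| 22 | Projection Optimization: A General Framework for Multi-Objective and Multi-Group RLHF | Proposes a projection optimization framework that efficiently solves multi-objective and multi-group RLHF problems. | reinforcement learning, RLHF | |
| 23 | Generalization Guarantees for Representation Learning via Data-Dependent Gaussian Mixture Priors | Proposes generalization guarantees for representation learning based on data-dependent Gaussian mixture priors. | representation learning | |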

🔬 Pillar 8: Physics-based Animation (1 paper)

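| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 24 | Spatiotemporal Forecasting in Climate Data Using EOFs and Machine Learning Models: A Case Study in Chile | Proposes a hybrid method based on EOF decomposition and machine learning for spatiotemporal forecasting of climate data in Chile. | spatiotemporal | |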

🔬 Pillar 4: Generative Motion (1 paper)

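| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 25 | A Data-Driven Real-Time Optimal Power Flow Algorithm Using Local Feedback | Proposes a data-driven real-time optimal power flow algorithm based on local feedback, suited to networks with a high share of distributed energy resources. | penetration | |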
