cs.LG (2025-02-20)

📊 33 papers in total | 🔗 8 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (21 papers, 🔗 6) · Pillar 2: RL & Architecture (11 papers, 🔗 2) · Pillar 8: Physics-based Animation (1 paper)

🔬 Pillar 9: Embodied Foundation Models (21 papers)

| # | Title | One-line takeaway | Tags | 🔗 |
|---|---|---|---|---|
| 1 | Determining Layer-wise Sparsity for Large Language Models Through a Theoretical Perspective | Proposes a theoretically grounded method for setting layer-wise sparsity ratios, addressing exploding reconstruction error in LLMs | large language model, multimodal | |
| 2 | Towards Physics-Guided Foundation Models | Proposes physics-guided foundation models to address conventional models' shortcomings in physical feasibility | foundation model | |
| 3 | Dynamic Low-Rank Sparse Adaptation for Large Language Models | Proposes dynamic low-rank sparse adaptation (LoSA), improving sparse LLM performance without added inference latency | large language model | |
| 4 | Generative adversarial networks vs large language models: a comparative study on synthetic tabular data generation | Proposes a GPT-4o zero-shot framework for synthetic tabular data generation that outperforms CTGAN | large language model | |
| 5 | Towards Efficient Automatic Self-Pruning of Large Language Models | Proposes a self-pruning framework for efficient optimization of large language models | large language model | |
| 6 | A Stronger Mixture of Low-Rank Experts for Fine-Tuning Foundation Models | Proposes a stronger mixture of low-rank experts to improve foundation-model fine-tuning | foundation model | |
| 7 | SleepGMUformer: A gated multimodal temporal neural network for sleep staging | Proposes SleepGMUformer, a gated multimodal temporal network for sleep staging | multimodal | |
| 8 | On the logical skills of large language models: evaluations using arbitrarily complex first-order logic problems | Proposes a generator of first-order logic problems with controllable complexity to evaluate LLM logical reasoning | large language model | |
| 9 | EigenShield: Causal Subspace Filtering via Random Matrix Theory for Adversarially Robust Vision-Language Models | EigenShield: causal subspace filtering via random matrix theory for adversarially robust vision-language models | large language model, multimodal | |
| 10 | UPCORE: Utility-Preserving Coreset Selection for Balanced Unlearning | Proposes UPCORE, a utility-preserving coreset selection framework for balanced unlearning | large language model | |
| 11 | Quantize What Counts: More for Keys, Less for Values | Proposes a geometry-based KV quantization scheme (more bits for keys, fewer for values) to optimize LLM inference; see the sketch after this table | large language model | |
| 12 | Beyond the Surface: Uncovering Implicit Locations with LLMs for Personalized Local News | Uses LLMs to uncover implicit locations, improving personalized local news recommendation | large language model | |
| 13 | CER: Confidence Enhanced Reasoning in LLMs | Proposes CER, a confidence-enhanced reasoning framework that improves LLM accuracy on math and open-domain tasks | large language model | |
| 14 | FedMobile: Enabling Knowledge Contribution-aware Multi-modal Federated Learning with Incomplete Modalities | FedMobile: a multimodal federated learning framework for incomplete modalities, improving the robustness of mobile sensing systems | multimodal | |
| 15 | Synergistic Fusion of Multi-Source Knowledge via Evidence Theory for High-Entropy Alloy Discovery | Proposes an evidence-theory-based multi-source knowledge fusion framework to accelerate high-entropy alloy discovery | large language model | |
| 16 | PEARL: Towards Permutation-Resilient LLMs | Proposes the PEARL framework, improving LLM robustness to input permutations in in-context learning | large language model | |
| 17 | Reward Models Identify Consistency, Not Causality | Finds that reward models favor consistency over causality, exposing limitations of current reward-modeling approaches | large language model | |
| 18 | Challenges of Multi-Modal Coreset Selection for Depth Prediction | Studies the challenges and limitations of multimodal coreset selection for depth prediction | multimodal | |
| 19 | S*: Test Time Scaling for Code Generation | Proposes the S* framework, using hybrid test-time scaling to substantially improve coverage and selection accuracy in code generation | large language model | |
| 20 | InductionBench: LLMs Fail in the Simplest Complexity Class | InductionBench: exposes LLMs' inductive-reasoning failures even in the simplest complexity class | large language model | |
| 21 | Multi-Faceted Studies on Data Poisoning can Advance LLM Development | Revisits data poisoning: multi-faceted studies can advance LLM development | large language model | |
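
A side note on entry #11: the title points at giving attention keys a larger quantization bit budget than values in the KV cache. The sketch below illustrates such an asymmetric budget with plain symmetric uniform quantization; the 4-bit/2-bit split, the helper functions, and the error check are illustrative assumptions, not the paper's actual method.

```python
# Minimal sketch: asymmetric KV-cache quantization budgets
# (more bits for keys, fewer for values). Illustrative only.
import numpy as np

def quantize_uniform(x: np.ndarray, n_bits: int):
    """Symmetric per-tensor uniform quantization to n_bits."""
    qmax = 2 ** (n_bits - 1) - 1              # e.g. 7 for 4-bit, 1 for 2-bit
    scale = np.abs(x).max() / qmax            # one scale for the whole tensor
    q = np.clip(np.round(x / scale), -qmax - 1, qmax).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
keys = rng.standard_normal((128, 64)).astype(np.float32)    # cached keys
values = rng.standard_normal((128, 64)).astype(np.float32)  # cached values

# Asymmetric budget: keys get 4 bits, values get 2 bits (assumed split).
qk, sk = quantize_uniform(keys, n_bits=4)
qv, sv = quantize_uniform(values, n_bits=2)

k_err = np.abs(dequantize(qk, sk) - keys).mean()
v_err = np.abs(dequantize(qv, sv) - values).mean()
print(f"mean abs error  keys@4bit: {k_err:.4f}  values@2bit: {v_err:.4f}")
```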

🔬 Pillar 2: RL & Architecture (11 papers)

| # | Title | One-line takeaway | Tags | 🔗 |
|---|---|---|---|---|
| 22 | Federated Fine-Tuning of Large Language Models: Kahneman-Tversky vs. Direct Preference Optimization | In federated fine-tuning of LLMs, KTO outperforms DPO, especially under single-response feedback | DPO, direct preference optimization, large language model | |
| 23 | ReVISE: Learning to Refine at Test-Time via Intrinsic Self-Verification | Proposes ReVISE, improving LLM reasoning at test time via intrinsic self-verification | reinforcement learning, preference learning, curriculum learning | |
| 24 | PPO-MI: Efficient Black-Box Model Inversion via Proximal Policy Optimization | Proposes PPO-MI, efficient black-box model inversion attacks via proximal policy optimization | reinforcement learning, PPO | |
| 25 | Learning from Reward-Free Offline Data: A Case for Planning with Latent Dynamics Models | Plans with latent dynamics models learned from reward-free offline data, improving generalization and data efficiency | reinforcement learning, latent dynamics | |
| 26 | Less is More: Improving LLM Alignment via Preference Data Selection | Improves LLM alignment via preference data selection, addressing noisy data in DPO training (the DPO objective is sketched after this table) | DPO, direct preference optimization, large language model | |
| 27 | STeCa: Step-level Trajectory Calibration for LLM Agent Learning | STeCa: a step-level trajectory calibration framework for LLM agent learning | behavior cloning, preference learning, large language model | |
| 28 | TimeDistill: Efficient Long-Term Time Series Forecasting with MLP via Cross-Architecture Distillation | TimeDistill: efficient long-term time-series forecasting with MLPs via cross-architecture distillation | distillation | |
| 29 | Reinforcement Learning with Graph Attention for Routing and Wavelength Assignment with Lightpath Reuse | Proposes a graph-attention reinforcement learning method for routing and wavelength assignment with lightpath reuse | reinforcement learning | |
| 30 | μRL: Discovering Transient Execution Vulnerabilities Using Reinforcement Learning | Proposes μRL, using reinforcement learning to efficiently discover transient-execution vulnerabilities in processors | reinforcement learning | |
| 31 | Llamba: Scaling Distilled Recurrent Models for Efficient Language Processing | Proposes Llamba: scaling distilled recurrent models for efficient language processing | Mamba, distillation | |
| 32 | Self-Improvement Towards Pareto Optimality: Mitigating Preference Conflicts in Multi-Objective Alignment | Proposes a self-improving DPO framework that constructs Pareto-optimal responses to mitigate preference conflicts in multi-objective alignment | DPO, direct preference optimization | |
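
Entries #22, #26, and #32 all build on Direct Preference Optimization (DPO). As background, here is a minimal sketch of the standard DPO objective in PyTorch; the toy log-probabilities are invented for illustration, and a real pipeline would compute summed token log-probs of each chosen/rejected response under the policy and a frozen reference model.

```python
# Minimal sketch of the standard DPO loss (background for #22, #26, #32).
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta: float = 0.1):
    """DPO: push the policy's (chosen - rejected) log-prob margin
    above the frozen reference model's margin, scaled by beta."""
    policy_margin = policy_chosen_logp - policy_rejected_logp
    ref_margin = ref_chosen_logp - ref_rejected_logp
    return -F.logsigmoid(beta * (policy_margin - ref_margin)).mean()

# Toy example: summed token log-probs for 3 preference pairs (made up).
loss = dpo_loss(
    policy_chosen_logp=torch.tensor([-12.3, -9.8, -15.1]),
    policy_rejected_logp=torch.tensor([-13.0, -9.5, -16.2]),
    ref_chosen_logp=torch.tensor([-12.8, -10.1, -15.5]),
    ref_rejected_logp=torch.tensor([-12.9, -10.0, -15.9]),
)
print(f"DPO loss: {loss.item():.4f}")
```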

🔬 Pillar 8: Physics-based Animation (1 paper)

| # | Title | One-line takeaway | Tags | 🔗 |
|---|---|---|---|---|
| 33 | Adaptive Sparsified Graph Learning Framework for Vessel Behavior Anomalies | Proposes an adaptive sparsified graph learning framework for detecting vessel behavior anomalies | spatiotemporal, TAMP | |
