cs.LG (2026-03-09)

📊 33 papers in total | 🔗 3 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (18 🔗2) · Pillar 2: RL & Architecture (11 🔗1) · Pillar 1: Robot Control (2) · Pillar 4: Generative Motion (1) · Pillar 7: Motion Retargeting (1)

🔬 Pillar 9: Embodied Foundation Models (18 papers)

| # | Title | One-line Summary | Tags | 🔗 |
|---|-------|------------------|------|----|
| 1 | DyQ-VLA: Temporal-Dynamic-Aware Quantization for Embodied Vision-Language-Action Models | Temporal-dynamic-aware quantization for embodied vision-language-action models. | vision-language-action, VLA | |
| 2 | Distributional Regression with Tabular Foundation Models: Evaluating Probabilistic Predictions via Proper Scoring Rules | Evaluates probabilistic predictions of tabular PFNs via proper scoring rules to improve distributional regression. | foundation model | |
| 3 | Deterministic Differentiable Structured Pruning for Large Language Models | Proposes Deterministic Differentiable structured Pruning (DDP) for efficient compression of large language models. | large language model | |
| 4 | Impermanent: A Live Benchmark for Temporal Generalization in Time Series Forecasting | Proposes the Impermanent live benchmark for evaluating temporal generalization in time-series forecasting. | foundation model | |
| 5 | Efficient Credal Prediction through Decalibration | Proposes an efficient decalibration-based credal prediction method for uncertainty quantification in complex models. | foundation model | |
| 6 | LycheeCluster: Efficient Long-Context Inference with Structure-Aware Chunking and Hierarchical KV Indexing | LycheeCluster: efficient long-context inference via structure-aware chunking and hierarchical KV indexing. | large language model | |
| 7 | Fibration Policy Optimization | Proposes Fibration Policy Optimization for multi-scale hierarchical policy optimization of large language models. | large language model | |
| 8 | SERQ: Saliency-Aware Low-Rank Error Reconstruction for LLM Quantization | SERQ: saliency-aware low-rank error reconstruction for LLM quantization. | large language model | |
| 9 | AutoAdapt: An Automated Domain Adaptation Framework for LLMs | AutoAdapt: an automated domain-adaptation framework for LLMs that improves performance in specialized domains. | large language model | |
| 10 | Invisible Safety Threat: Malicious Finetuning for LLM via Steganography | Proposes a steganography-based malicious fine-tuning method that makes an LLM covertly generate harmful content while appearing safe. | large language model | |
| 11 | EAGLE-Pangu: Accelerator-Safe Tree Speculative Decoding on Ascend NPUs | EAGLE-Pangu: accelerator-safe tree speculative decoding on Ascend NPUs. | large language model | |
| 12 | Tiny Autoregressive Recursive Models | Explores recursion in autoregressive models: evaluates the effectiveness of tiny recursive models on autoregressive tasks. | foundation model | |
| 13 | Stabilized Fine-Tuning with LoRA in Federated Learning: Mitigating the Side Effect of Client Size and Rank via the Scaling Factor | Proposes SFed-LoRA, using an adaptive scaling factor to address the instability of LoRA fine-tuning in federated learning. | large language model | |
| 14 | Capacity-Aware Mixture Law Enables Efficient LLM Data Optimization | Proposes the capacity-aware mixture law CAMEL for efficient optimization of LLM data mixtures and improved performance. | large language model | |
| 15 | FedMomentum: Preserving LoRA Training Momentum in Federated Fine-Tuning | FedMomentum: a framework for preserving LoRA training momentum in federated fine-tuning. | large language model | |
| 16 | ELLMob: Event-Driven Human Mobility Generation with Self-Aligned LLM Framework | ELLMob: event-driven human mobility generation with a self-aligned LLM framework. | large language model | |
| 17 | LeJOT-AutoML: LLM-Driven Feature Engineering for Job Execution Time Prediction in Databricks Cost Optimization | LeJOT-AutoML: LLM-driven feature engineering for job execution time prediction in Databricks cost optimization. | large language model | |
| 18 | Reject, Resample, Repeat: Understanding Parallel Reasoning in Language Model Inference | Proposes a particle-filter-based framework for parallel reasoning in language model inference, optimizing sampling efficiency and analyzing its theoretical limits. | large language model | |

🔬 Pillar 2: RL & Architecture (11 papers)

| # | Title | One-line Summary | Tags | 🔗 |
|---|-------|------------------|------|----|
| 19 | MJ1: Multimodal Judgment via Grounded Verification | Proposes MJ1 to address visual-evidence grounding in multimodal judgment. | reinforcement learning, multimodal, visual grounding | |
| 20 | Data-Driven Priors for Uncertainty-Aware Deterioration Risk Prediction with Multimodal Data | MedCertAIn: improves the reliability of deterioration risk prediction using multimodal data and data-driven priors. | predictive model, multimodal | |
| 21 | Model-based Offline RL via Robust Value-Aware Model Learning with Implicitly Differentiable Adaptive Weighting | Proposes ROMI, improving offline RL via robust value-aware model learning with implicitly differentiable adaptive weighting. | reinforcement learning, offline RL | |
| 22 | A Recipe for Stable Offline Multi-agent Reinforcement Learning | Proposes scale-invariant value normalization (SVN) to address the instability of nonlinear value decomposition in offline multi-agent RL. | reinforcement learning, offline reinforcement learning | |
| 23 | Impact of Connectivity on Laplacian Representations in Reinforcement Learning | Studies how connectivity affects Laplacian-based state representations in reinforcement learning. | reinforcement learning, representation learning | |
| 24 | Learning Hierarchical Knowledge in Text-Rich Networks with Taxonomy-Informed Representation Learning | Proposes TIER, enhancing representations of text-rich networks through taxonomy-informed hierarchical knowledge learning. | representation learning, contrastive learning | |
| 25 | Breaking the Bias Barrier in Concave Multi-Objective Reinforcement Learning | Proposes a multilevel Monte Carlo natural policy gradient algorithm to address bias in concave-scalarized multi-objective RL. | reinforcement learning | |
| 26 | Reasoning as Compression: Unifying Budget Forcing via the Conditional Information Bottleneck | Proposes a conditional-information-bottleneck approach to budget forcing, improving LLM reasoning efficiency and accuracy. | reinforcement learning, chain-of-thought | |
| 27 | Posterior Sampling Reinforcement Learning with Gaussian Processes for Continuous Control: Sublinear Regret Bounds for Unbounded State Spaces | Proposes posterior sampling reinforcement learning with Gaussian processes for continuous control. | reinforcement learning | |
| 28 | Meta-RL with Shared Representations Enables Fast Adaptation in Energy Systems | Proposes a meta-RL framework with shared representations that enables fast adaptive control of energy systems. | reinforcement learning, representation learning | |
| 29 | DARC: Disagreement-Aware Alignment via Risk-Constrained Decoding | DARC: alignment via risk-constrained decoding, addressing heterogeneous preferences in preference alignment. | RLHF, DPO | |

🔬 Pillar 1: Robot Control (2 papers)

| # | Title | One-line Summary | Tags | 🔗 |
|---|-------|------------------|------|----|
| 30 | Towards Batch-to-Streaming Deep Reinforcement Learning for Continuous Control | Proposes S2AC and SDAC, two streaming deep RL algorithms suited to online fine-tuning on resource-constrained devices. | sim2real, reinforcement learning, deep reinforcement learning | |
| 31 | SYNAPSE: Framework for Neuron Analysis and Perturbation in Sequence Encoding | SYNAPSE: a framework for neuron analysis and perturbation in sequence encoding. | manipulation | |

🔬 Pillar 4: Generative Motion (1 paper)

| # | Title | One-line Summary | Tags | 🔗 |
|---|-------|------------------|------|----|
| 32 | C$^2$FG: Control Classifier-Free Guidance via Score Discrepancy Analysis | Proposes C$^2$FG to address limitations of existing classifier-free guidance methods. | classifier-free guidance | |

🔬 Pillar 7: Motion Retargeting (1 paper)

| # | Title | One-line Summary | Tags | 🔗 |
|---|-------|------------------|------|----|
| 33 | Context-free Self-Conditioned GAN for Trajectory Forecasting | Proposes a context-free self-conditioned GAN for trajectory forecasting, improving motion-pattern learning. | human motion | |
