cs.LG(2025-12-31)

📊 22 papers in total | 🔗 2 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (13 🔗1) · Pillar 2: RL Algorithms & Architecture (8 🔗1) · Pillar 8: Physics-based Animation (1)

🔬 Pillar 9: Embodied Foundation Models (13 papers)

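# | Title | Key point | Tags | 🔗
1 | Self-Supervised Neural Architecture Search for Multimodal Deep Neural Networks | Proposes a self-supervised architecture search method for multimodal neural networks to address the dependence on labeled data. | multimodal
2 | PriceSeer: Evaluating Large Language Models in Real-Time Stock Prediction | PriceSeer: a dynamic benchmark for evaluating large language models on real-time stock prediction. | large language model
3 | Dynamic Bayesian Optimization Framework for Instruction Tuning in Partial Differential Equation Discovery | NeuroSymBO: dynamic Bayesian optimization of instruction tuning for partial differential equation discovery. | large language model
4 | Diffusion Language Models are Provably Optimal Parallel Samplers | Proposes chain-of-thought-based diffusion language models that achieve optimal parallel sampling and improve space complexity. | chain-of-thought
5 | Efficiently Estimating Data Efficiency for Language Model Fine-tuning | Proposes a data-efficiency prediction method based on gradient cosine similarity, reducing annotation cost for LLM fine-tuning. | large language model
6 | When to Ponder: Adaptive Compute Allocation for Code Generation via Test-Time Training | Proposes PonderTTT, which adaptively allocates compute for code generation via test-time training. | large language model
7 | Characterization of Transfer Using Multi-task Learning Curves | Proposes a way to characterize transfer using multi-task learning curves, for evaluating transfer effects at different data scales. | foundation model
8 | Unregularized Linear Convergence in Zero-Sum Game from Preference Feedback | Proposes an unregularized OMWU algorithm that addresses linear convergence in zero-sum games from preference feedback. | large language model
9 | FPGA Co-Design for Efficient N:M Sparse and Quantized Model Inference | Proposes an FPGA-based hardware-software co-design framework to accelerate inference for sparse, quantized large language models. | large language model
10 | BandiK: Efficient Multi-Task Decomposition Using a Multi-Bandit Framework | BandiK: efficient multi-task decomposition using a multi-armed bandit framework. | foundation model
11 | Dynamic Large Concept Models: Latent Reasoning in an Adaptive Semantic Space | Proposes Dynamic Large Concept Models (DLCM), which improve LLM efficiency through latent reasoning in an adaptive semantic space. | large language model
12 | MultiRisk: Multiple Risk Control via Iterative Score Thresholding | Proposes the MultiRisk algorithm, which controls multiple risk constraints for generative AI via iterative thresholding. | large language model
13 | More Than Bits: Multi-Envelope Double Binary Factorization for Extreme Quantization | Proposes Multi-Envelope Double Binary Factorization (MDBF) for extremely low-bit quantization of large language models, improving accuracy. | large language model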

🔬 Pillar 2: RL Algorithms & Architecture (8 papers)

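# | Title | Key point | Tags | 🔗
14 | Dichotomous Diffusion Policy Optimization | Proposes DIPOLE, a reinforcement learning algorithm for stable and controllable diffusion policy optimization. | reinforcement learning, diffusion policy, vision-language-action
15 | Sparse Offline Reinforcement Learning with Corruption Robustness | Proposes an actor-critic algorithm based on sparse robust estimation to handle data corruption in sparse offline RL. | reinforcement learning, offline RL, offline reinforcement learning
16 | From Perception to Punchline: Empowering VLM with the Art of In-the-wild Meme | Proposes the HUMOR framework, enabling VLMs to generate in-the-wild memes that are funnier and better aligned with human preferences. | reinforcement learning, HuMoR, multimodal
17 | GRL-SNAM: Geometric Reinforcement Learning with Path Differential Hamiltonians for Simultaneous Navigation and Mapping in Unknown Environments | Proposes GRL-SNAM, which uses geometric reinforcement learning with path differential Hamiltonians for simultaneous navigation and mapping in unknown environments. | reinforcement learning, policy learning
18 | Many Minds from One Model: Bayesian-Inspired Transformers for Population Diversity | Proposes Population Bayesian Transformers, which sample diverse model instances from a single pretrained LLM to improve generation diversity. | reinforcement learning, large language model
19 | ResponseRank: Data-Efficient Reward Modeling through Preference Strength Learning | ResponseRank: data-efficient reward modeling through preference strength learning. | reinforcement learning, preference learning, RLHF
20 | Attribution-Guided Distillation of Matryoshka Sparse Autoencoders | Proposes DMSAE, which uses attribution-guided distillation of Matryoshka sparse autoencoders to improve feature consistency and transferability. | distillation
21 | Robust Bayesian Dynamic Programming for On-policy Risk-sensitive Reinforcement Learning | Proposes robust Bayesian dynamic programming to handle transition uncertainty in on-policy risk-sensitive reinforcement learning. | reinforcement learning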

🔬 Pillar 8: Physics-based Animation (1 paper)

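# | Title | Key point | Tags | 🔗
22 | Learning Temporally Consistent Turbulence Between Sparse Snapshots via Diffusion Models | Proposes a diffusion-model-based method for reconstructing turbulence between sparse snapshots. | spatiotemporal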
