cs.LG (2025-12-31)

📊 17 papers in total

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (10) · Pillar 2: RL & Architecture (6) · Pillar 8: Physics-based Animation (1)

🔬 Pillar 9: Embodied Foundation Models (10 papers)

| # | Title | One-line Summary | Tags | 🔗 |
|---|-------|------------------|------|----|
| 1 | Self-Supervised Neural Architecture Search for Multimodal Deep Neural Networks | Proposes a self-supervised architecture search method for multimodal deep neural networks, removing the dependence on labeled data. | multimodal | |
| 2 | Diffusion Language Models are Provably Optimal Parallel Samplers | Proposes chain-of-thought diffusion language models that achieve provably optimal parallel sampling, and argues that a revision mechanism is necessary. | chain-of-thought | |
| 3 | Efficiently Estimating Data Efficiency for Language Model Fine-tuning | Proposes predicting data efficiency from gradient cosine similarity, reducing the annotation cost of LLM fine-tuning (see the sketch after this table). | large language model | |
| 4 | Characterization of Transfer Using Multi-task Learning Curves | Proposes characterizing transfer via multi-task learning curves, describing transfer effects under dataset perturbations. | foundation model | |
| 5 | Unregularized Linear Convergence in Zero-Sum Game from Preference Feedback | Proposes an unregularized OMWU algorithm that achieves linear convergence in zero-sum games with preference feedback. | large language model | |
| 6 | FPGA Co-Design for Efficient N:M Sparse and Quantized Model Inference | Proposes an FPGA hardware-software co-design framework that accelerates inference of sparse, quantized large language models. | large language model | |
| 7 | BandiK: Efficient Multi-Task Decomposition Using a Multi-Bandit Framework | Proposes BandiK, a multi-bandit framework for selecting auxiliary tasks in multi-task learning. | foundation model | |
| 8 | Dynamic Large Concept Models: Latent Reasoning in an Adaptive Semantic Space | Proposes Dynamic Large Concept Models (DLCM), improving LLM efficiency through latent reasoning in an adaptive semantic space. | large language model | |
| 9 | MultiRisk: Multiple Risk Control via Iterative Score Thresholding | Proposes the MultiRisk algorithm, which controls multiple risks of generative AI systems via iterative score thresholding. | large language model | |
| 10 | More Than Bits: Multi-Envelope Double Binary Factorization for Extreme Quantization | Proposes Multi-Envelope Double Binary Factorization (MDBF) for extreme low-bit quantization of large language models, improving accuracy. | large language model | |
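
Illustrative note on #3: below is a minimal, generic sketch of the gradient cosine-similarity idea, scoring a training example by how well its gradient aligns with a held-out validation gradient. The toy model and helper names (`flat_grad`, `score_example`) are assumptions for illustration, not the paper's actual estimator or API.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Linear(16, 2)        # toy stand-in for the model being fine-tuned
loss_fn = nn.CrossEntropyLoss()

def flat_grad(loss):
    """Flatten the gradient of `loss` w.r.t. all model parameters into one vector."""
    grads = torch.autograd.grad(loss, list(model.parameters()))
    return torch.cat([g.reshape(-1) for g in grads])

def score_example(x, y, x_val, y_val):
    """Cosine similarity between one training example's gradient and a
    held-out validation gradient; a higher score suggests the example
    is more useful for fine-tuning."""
    g_train = flat_grad(loss_fn(model(x), y))
    g_val = flat_grad(loss_fn(model(x_val), y_val))
    return F.cosine_similarity(g_train, g_val, dim=0)

x, y = torch.randn(1, 16), torch.tensor([1])                  # one candidate example
x_val, y_val = torch.randn(8, 16), torch.randint(0, 2, (8,))  # small validation batch
print(score_example(x, y, x_val, y_val).item())
```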

🔬 Pillar 2: RL & Architecture (6 papers)

| # | Title | One-line Summary | Tags | 🔗 |
|---|-------|------------------|------|----|
| 11 | Sparse Offline Reinforcement Learning with Corruption Robustness | Proposes an actor-critic algorithm based on sparse robust estimation, addressing data corruption in sparse offline RL. | reinforcement learning, offline RL, offline reinforcement learning | |
| 12 | From Perception to Punchline: Empowering VLM with the Art of In-the-wild Meme | Proposes the HUMOR framework, enabling VLMs to generate in-the-wild memes that are funnier and better aligned with human preferences. | reinforcement learning, HuMoR, multimodal | |
| 13 | Many Minds from One Model: Bayesian Transformers for Population Intelligence | Proposes Population Bayesian Transformers, improving the diversity and decision-making ability of Transformer models. | reinforcement learning, large language model | |
| 14 | ResponseRank: Data-Efficient Reward Modeling through Preference Strength Learning | Proposes ResponseRank, achieving data-efficient reward modeling through preference-strength learning (see the sketch after this table). | reinforcement learning, preference learning, RLHF | |
| 15 | Attribution-Guided Distillation of Matryoshka Sparse Autoencoders | Proposes DMSAE, distilling Matryoshka sparse autoencoders with attribution guidance to improve feature consistency and transferability. | distillation | |
| 16 | Robust Bayesian Dynamic Programming for On-policy Risk-sensitive Reinforcement Learning | Proposes robust Bayesian dynamic programming to handle transition uncertainty in on-policy risk-sensitive reinforcement learning. | reinforcement learning | |
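
Background note on #14: reward models in RLHF are commonly trained with the Bradley-Terry pairwise loss sketched below. The optional `strength` argument is a hypothetical hook for a per-pair preference-strength weight; it does not reproduce ResponseRank's actual method.

```python
from typing import Optional
import torch
import torch.nn.functional as F

def pairwise_reward_loss(r_chosen: torch.Tensor,
                         r_rejected: torch.Tensor,
                         strength: Optional[torch.Tensor] = None) -> torch.Tensor:
    """Bradley-Terry loss: -log sigmoid(r_chosen - r_rejected) per pair.

    `strength` is a hypothetical per-pair weight one might derive from
    annotated preference strength; plain Bradley-Terry leaves it as None.
    """
    loss = -F.logsigmoid(r_chosen - r_rejected)   # shape: (batch,)
    if strength is not None:
        loss = loss * strength
    return loss.mean()

# Toy usage: scalar rewards for 4 preference pairs from some reward model.
r_c, r_r = torch.randn(4), torch.randn(4)
print(pairwise_reward_loss(r_c, r_r).item())
```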

🔬 Pillar 8: Physics-based Animation (1 paper)

| # | Title | One-line Summary | Tags | 🔗 |
|---|-------|------------------|------|----|
| 17 | Learning Temporally Consistent Turbulence Between Sparse Snapshots via Diffusion Models | Proposes a temporally consistent turbulence interpolation method based on conditional diffusion models, reconstructing turbulence between sparse snapshots (see the sketch after this table). | spatiotemporal | |
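
Background note on #17: one standard way to condition a diffusion denoiser on surrounding frames is channel concatenation, as in the generic DDPM-style reverse-sampling sketch below. The toy network, noise schedule, and omitted time embedding are assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

T = 1000
betas = torch.linspace(1e-4, 0.02, T)          # standard DDPM noise schedule
alphas = 1.0 - betas
alpha_bars = torch.cumprod(alphas, dim=0)

# Toy stand-in for a U-Net; it sees the noisy intermediate frame plus the
# two endpoint snapshots stacked as channels (time embedding omitted here).
denoiser = nn.Conv2d(3, 1, kernel_size=3, padding=1)

@torch.no_grad()
def reverse_step(x_t, t, snap_prev, snap_next):
    """One DDPM reverse step for the intermediate frame x_t, conditioned on
    the two surrounding snapshots by channel concatenation."""
    eps = denoiser(torch.cat([x_t, snap_prev, snap_next], dim=1))
    a, ab = alphas[t], alpha_bars[t]
    mean = (x_t - (1.0 - a) / torch.sqrt(1.0 - ab) * eps) / torch.sqrt(a)
    noise = torch.randn_like(x_t) if t > 0 else torch.zeros_like(x_t)
    return mean + torch.sqrt(betas[t]) * noise

x = torch.randn(1, 1, 64, 64)                  # intermediate frame: start from noise
s0, s1 = torch.randn(1, 1, 64, 64), torch.randn(1, 1, 64, 64)  # endpoint snapshots
for t in reversed(range(T)):
    x = reverse_step(x, t, s0, s1)
```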

⬅️ Back to the cs.LG index · 🏠 Back to home