cs.LG (2025-02-07)

📊 22 papers in total | 🔗 1 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (13 🔗1) · Pillar 2: RL & Architecture (8) · Pillar 8: Physics-based Animation (1)

🔬 Pillar 9: Embodied Foundation Models (13 papers)

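| # | Title | One-sentence summary | Tags |
|---|---|---|---|
| 1 | Can Large Language Models Understand Intermediate Representations in Compilers? | Evaluates how well large language models understand compiler intermediate representations and reveals their limitations in instruction-level reasoning. | large language model |
| 2 | Confidence Elicitation: A New Attack Vector for Large Language Models | Proposes confidence-elicitation attacks to improve the adversarial robustness of large language models. | large language model |
| 3 | Leveraging Pre-Trained Models for Multimodal Class-Incremental Learning under Adaptive Fusion | Proposes a multimodal class-incremental learning method with adaptive fusion built on pre-trained models, addressing audio-visual-text fusion and catastrophic forgetting. | multimodal |
| 4 | Prot2Chat: Protein LLM with Early-Fusion of Text, Sequence and Structure | Prot2Chat: a protein LLM that fuses text, sequence, and structure for protein question answering. | large language model, multimodal |
| 5 | Unveiling the Mechanisms of Explicit CoT Training: How CoT Enhances Reasoning Generalization | Reveals the mechanisms of CoT training: how CoT enhances the reasoning generalization of LLMs. | large language model, chain-of-thought |
| 6 | BCQ: Block Clustered Quantization for 4-bit (W4A4) LLM Inference | Proposes Block Clustered Quantization (BCQ), enabling W4A4 low-precision LLM inference with less than 1% accuracy loss. | large language model |
| 7 | Taming Latency-Memory Trade-Off in MoE-Based LLM Serving via Fine-Grained Expert Offloading | FineMoE: optimizes the latency-memory trade-off in MoE-LLM inference through fine-grained expert offloading. | large language model |
| 8 | Hypencoder: Hypernetworks for Information Retrieval | Proposes Hypencoder, which uses a hypernetwork to generate query-specific retrieval functions, significantly improving information retrieval performance. | instruction following |
| 9 | Optimizing Temperature for Language Models with Multi-Sample Inference | Proposes an entropy-based unsupervised temperature optimization method that improves the multi-sample inference performance of LLMs. | large language model |
| 10 | Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach | Proposes a recurrent-depth language model that reasons in latent space to scale up test-time compute. | chain-of-thought |
| 11 | Refining Integration-by-Parts Reduction of Feynman Integrals with Machine Learning | Uses machine learning to optimize the integration-by-parts reduction of Feynman integrals. | large language model |
| 12 | Causality can systematically address the monsters under the bench(marks) | Uses causality to systematically address biases and artifacts in machine learning benchmarks. | large language model |
| 13 | QuEST: Stable Training of LLMs with 1-Bit Weights and Activations | QuEST: stable training of LLMs with 1-bit weights and activations. | large language model |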

🔬 Pillar 2: RL & Architecture (8 papers)

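| # | Title | One-sentence summary | Tags |
|---|---|---|---|
| 14 | Behavior-Regularized Diffusion Policy Optimization for Offline Reinforcement Learning | Proposes BDPO, a behavior-regularized diffusion policy optimization method for offline reinforcement learning. | reinforcement learning, offline reinforcement learning, diffusion policy |
| 15 | Seasonal Station-Keeping of Short Duration High Altitude Balloons using Deep Reinforcement Learning | Uses deep reinforcement learning to achieve seasonal station-keeping of short-duration high-altitude balloons. | reinforcement learning, deep reinforcement learning |
| 16 | Prompt Tuning Decision Transformers with Structured and Scalable Bandits | Proposes prompt tuning for Decision Transformers based on structured bandits, improving multi-task generalization in offline reinforcement learning. | reinforcement learning, offline reinforcement learning, decision transformer |
| 17 | Graph Contrastive Learning for Connectome Classification | Proposes a graph contrastive learning method for connectome classification, improving brain network analysis performance. | representation learning, contrastive learning |
| 18 | A Foundational Brain Dynamics Model via Stochastic Optimal Control | Proposes a foundational brain dynamics model based on stochastic optimal control for diagnosing and predicting brain disorders. | latent dynamics, SSM, state space model |
| 19 | Fast Adaptive Anti-Jamming Channel Access via Deep Q Learning and Coarse-Grained Spectrum Prediction | Proposes a fast adaptive anti-jamming channel access method based on deep Q-learning and coarse-grained spectrum prediction. | reinforcement learning, deep reinforcement learning, DRL |
| 20 | An Extended Benchmarking of Multi-Agent Reinforcement Learning Algorithms in Complex Fully Cooperative Tasks | Extends multi-agent reinforcement learning benchmarking, revealing algorithm performance bottlenecks in complex cooperative tasks. | reinforcement learning |
| 21 | The Alpha-Alternator: Dynamic Adaptation To Varying Noise Levels In Sequences Using The Vendi Score For Improved Robustness and Performance | Proposes the Alpha-Alternator model, which uses the Vendi Score to adapt dynamically to varying noise levels in sequences, improving robustness and performance. | latent dynamics, Mamba |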

🔬 Pillar 8: Physics-based Animation (1 paper)

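| # | Title | One-sentence summary | Tags |
|---|---|---|---|
| 22 | GST-UNet: A Neural Framework for Spatiotemporal Causal Inference with Time-Varying Confounding | GST-UNet: a neural framework for spatiotemporal causal inference that addresses time-varying confounding. | spatiotemporal |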
