cs.LG (2025-05-12)

📊 30 papers in total | 🔗 1 with code

🎯 Interest Area Navigation

Pillar 2: RL & Architecture (14 🔗1) · Pillar 9: Embodied Foundation Models (10) · Pillar 8: Physics-based Animation (4) · Pillar 5: Interaction & Reaction (2)

🔬 Pillar 2: RL & Architecture (14 papers)

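| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 1 | Cache-Efficient Posterior Sampling for Reinforcement Learning with LLM-Derived Priors Across Discrete and Continuous Domains | Proposes a cache-efficient posterior sampling framework that speeds up RL with LLM-derived priors in discrete and continuous domains. | reinforcement learning, offline RL, CQL | |
| 2 | RLSR: Reinforcement Learning from Self Reward | Proposes RLSR, reinforcement learning from self-reward, to improve LLM performance on complex problem solving. | reinforcement learning, large language model | |
| 3 | Combining Bayesian Inference and Reinforcement Learning for Agent Decision Making: A Review | A review of agent decision-making methods that combine Bayesian inference with reinforcement learning. | reinforcement learning, policy learning, model-based RL | |
| 4 | Simple yet Effective Semi-supervised Knowledge Distillation from Vision-Language Models via Dual-Head Optimization | Proposes Dual-Head Optimization (DHO) for efficient semi-supervised learning via knowledge distillation from vision-language models. | distillation | |
| 5 | An Extra RMSNorm is All You Need for Fine Tuning to 1.58 Bits | Shows that a single extra RMSNorm is enough to fine-tune to 1.58-bit quantization. | distillation, large language model | |
| 6 | A Theoretical Framework for Explaining Reinforcement Learning with Shapley Values | Proposes the SVERL framework, which uses Shapley values to explain an RL agent's behavior, outcomes, and predictions. | reinforcement learning | |
| 7 | MLE-Dojo: Interactive Environments for Empowering LLM Agents in Machine Learning Engineering | MLE-Dojo: interactive environments that empower LLM agents to do machine learning engineering. | reinforcement learning, large language model | |
| 8 | Self-Supervised Transformer-based Contrastive Learning for Intrusion Detection Systems | Proposes a Transformer-based, self-supervised contrastive learning intrusion detection system with improved generalization. | contrastive learning | |
| 9 | EAGLE: Contrastive Learning for Efficient Graph Anomaly Detection | EAGLE: an efficient contrastive-learning model for graph anomaly detection, applicable to heterogeneous graphs. | contrastive learning | |
| 10 | Online Episodic Convex Reinforcement Learning | Proposes an online episodic convex RL algorithm for online learning in MDPs with convex objective functions. | reinforcement learning | |
| 11 | INTELLECT-2: A Reasoning Model Trained Through Globally Decentralized Reinforcement Learning | INTELLECT-2: a 32-billion-parameter reasoning model trained with globally decentralized reinforcement learning. | reinforcement learning | |
| 12 | REMEDI: Relative Feature Enhanced Meta-Learning with Distillation for Imbalanced Prediction | REMEDI: combines relative-feature-enhanced meta-learning with distillation to handle extremely imbalanced prediction. | distillation | |
| 13 | Representation Learning with Mutual Influence of Modalities for Node Classification in Multi-Modal Heterogeneous Networks | Proposes the HGNN-IMA model, which improves node classification in multi-modal heterogeneous networks by learning the mutual influence of modalities. | representation learning | |
| 14 | VoI-Driven Joint Optimization of Control and Communication in Vehicular Digital Twin Network | Proposes a value-of-information-driven framework for jointly optimizing control and communication in vehicular digital twin networks. | reinforcement learning, deep reinforcement learning, DRL | |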

🔬 Pillar 9: Embodied Foundation Models (10 papers)

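| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 15 | Symbolic Regression with Multimodal Large Language Models and Kolmogorov Arnold Networks | Proposes a symbolic regression method based on multimodal large language models and Kolmogorov Arnold networks. | large language model, multimodal | |
| 16 | Multimodal Cancer Modeling in the Age of Foundation Model Embeddings | Proposes a multimodal cancer modeling approach based on foundation model embeddings, improving cancer survival prediction. | foundation model, multimodal | |
| 17 | Assessing the Chemical Intelligence of Large Language Models | ChemIQ: a new benchmark for assessing the organic chemistry reasoning of large language models. | large language model | |
| 18 | SpecRouter: Adaptive Routing for Multi-Level Speculative Decoding in Large Language Models | SpecRouter: an adaptive routing framework for multi-level speculative decoding in large language models. | large language model | |
| 19 | Direct Density Ratio Optimization: A Statistically Consistent Approach to Aligning Large Language Models | Proposes Direct Density Ratio Optimization (DDRO) for more reliable alignment of large language models with human preferences. | large language model | |
| 20 | Injecting Knowledge Graphs into Large Language Models | Proposes a method for injecting knowledge graphs into large language models to improve symbolic reasoning. | large language model | |
| 21 | Beyond Input Activations: Identifying Influential Latents by Gradient Sparse Autoencoders | Proposes the Gradient Sparse Autoencoder (GradSAE), which uses gradient information to identify influential latents in large language models. | large language model | |
| 22 | TACOS: Temporally-aligned Audio CaptiOnS for Language-Audio Pretraining | TACOS: a dataset of temporally aligned audio captions for language-audio pretraining. | large language model | |
| 23 | LEAD: Iterative Data Selection for Efficient LLM Instruction Tuning | LEAD: an efficient iterative data selection framework for LLM instruction tuning that requires no additional model inference. | large language model | |
| 24 | Uncertainty Profiles for LLMs: Uncertainty Source Decomposition and Adaptive Model-Metric Selection | Proposes an uncertainty decomposition framework for LLMs that enables task-adaptive model and metric selection, improving reliability. | large language model | |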

🔬 Pillar 8: Physics-based Animation (4 papers)

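| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 25 | The Geography of Transportation Cybersecurity: Visitor Flows, Industry Clusters, and Spatial Dynamics | Proposes the BiTransGCN framework for predicting visitor flows and spatial dynamics of transportation cybersecurity industry clusters. | spatiotemporal | |
| 26 | Self-cross Feature based Spiking Neural Networks for Efficient Few-shot Learning | Proposes a spiking neural network based on self-cross features for efficient few-shot learning. | spatiotemporal | |
| 27 | Joint Graph Convolution and Sequential Modeling for Scalable Network Traffic Estimation | Proposes a traffic prediction method based on graph convolution and sequential modeling, improving accuracy in complex network environments. | spatiotemporal | |
| 28 | EnvCDiff: Joint Refinement of Environmental Information and Channel Fingerprints via Conditional Generative Diffusion Model | EnvCDiff: jointly refines environmental information and channel fingerprints with a conditional generative diffusion model. | diff-sim | |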

🔬 Pillar 5: Interaction & Reaction (2 papers)

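| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 29 | Private LoRA Fine-tuning of Open-Source LLMs with Homomorphic Encryption | Proposes a private LoRA fine-tuning scheme based on homomorphic encryption to protect the privacy of LLM training data. | OMOMO, large language model | |
| 30 | Latent Behavior Diffusion for Sequential Reaction Generation in Dyadic Setting | Proposes a latent behavior diffusion model for generating more natural facial reactions in dyadic conversational settings. | reaction synthesis | |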
