cs.LG (2025-01-22)

📊 29 papers in total | 🔗 3 with code

🎯 Interest Area Navigation

Pillar 2: RL & Architecture (18 🔗2) · Pillar 9: Embodied Foundation Models (9 🔗1) · Pillar 5: Interaction & Reaction (1) · Pillar 8: Physics-based Animation (1)

🔬 Pillar 2: RL & Architecture (18 papers)

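| # | Title | Key point | Tags |
|---|-------|-----------|------|
| 1 | Blockchain-based Crowdsourced Deep Reinforcement Learning as a Service | Proposes a blockchain-based, crowdsourced deep reinforcement learning as a service framework that lowers the barrier to applying DRL. | reinforcement learning, deep reinforcement learning, DRL |
| 2 | Online Preference Alignment for Language Models via Count-based Exploration | Proposes the COPO algorithm, which achieves online preference alignment of language models through count-based exploration. | reinforcement learning, RLHF, direct preference optimization |
| 3 | Deep Reinforcement Learning with Hybrid Intrinsic Reward Model | Proposes the HIRE framework, which improves exploration efficiency and skill learning in reinforcement learning through hybrid intrinsic rewards. | reinforcement learning, deep reinforcement learning, reward shaping |
| 4 | Inverse Reinforcement Learning with Switching Rewards and History Dependency for Characterizing Animal Behaviors | SWIRL: an inverse reinforcement learning method that combines time-varying rewards with history dependency to characterize animal behavior. | reinforcement learning, inverse reinforcement learning |
| 5 | Adaptive Data Exploitation in Deep Reinforcement Learning | Proposes the ADEPT framework, which improves the data efficiency and generalization of deep reinforcement learning through adaptive data exploitation. | reinforcement learning, deep reinforcement learning |
| 6 | To Measure or Not: A Cost-Sensitive, Selective Measuring Environment for Agricultural Management Decisions with Reinforcement Learning | Proposes a reinforcement-learning-based, cost-sensitive environment for agricultural management decisions that optimizes crop measurement and fertilization. | reinforcement learning, PPO |
| 7 | PPO-Based Vehicle Control for Ramp Merging Scheme Assisted by Enhanced C-V2X | Proposes a ramp-merging vehicle control scheme based on PPO and enhanced C-V2X that improves safety and efficiency. | reinforcement learning, PPO |
| 8 | Exploring the Technology Landscape through Topic Modeling, Expert Involvement, and Reinforcement Learning | Proposes a method that combines topic modeling with reinforcement learning to address the challenges of technological change. | reinforcement learning |
| 9 | Enhancing Multi-Attribute Fairness in Healthcare Predictive Modeling | Proposes a multi-attribute fairness optimization method that improves the fairness of healthcare AI predictive models. | predictive model |
| 10 | Attention-Driven Hierarchical Reinforcement Learning with Particle Filtering for Source Localization in Dynamic Fields | Proposes an attention-based hierarchical reinforcement learning framework for source localization in dynamic fields. | reinforcement learning |
| 11 | A Probabilistic Model for Non-Contrastive Learning | Proposes a non-contrastive learning framework based on a probabilistic model and reveals its connection to PCA and non-contrastive losses. | contrastive learning |
| 12 | An Offline Multi-Agent Reinforcement Learning Framework for Radio Resource Management | Proposes an offline multi-agent reinforcement learning framework for radio resource management that improves user rates. | reinforcement learning |
| 13 | Bridging Text and Crystal Structures: Literature-driven Contrastive Learning for Materials Science | Proposes Contrastive Language-Structure Pre-training (CLaSP) for text-based crystal structure retrieval in materials science. | contrastive learning |
| 14 | HierPromptLM: A Pure PLM-based Framework for Representation Learning on Heterogeneous Text-rich Networks | Proposes HierPromptLM, a pure PLM framework for representation learning on heterogeneous text-rich networks. | representation learning |
| 15 | Graph Representation Learning with Diffusion Generative Models | Proposes a graph representation learning method based on diffusion generative models that effectively extracts embeddings from graph-structured data. | representation learning |
| 16 | On Generalization and Distributional Update for Mimicking Observations with Adequate Exploration | Proposes the MODULE algorithm, which learns to imitate observations through distributional updates, addressing insufficient exploration and instability in imitation learning. | reinforcement learning, SAC |
| 17 | GRAMA: Adaptive Graph Autoregressive Moving Average Models | Proposes GRAMA, an adaptive graph autoregressive moving average model that strengthens long-range dependency modeling in graph neural networks. | SSM, state space model |
| 18 | Knowledge-Driven Federated Graph Learning on Model Heterogeneity | Proposes the FedGKC framework, which addresses knowledge transfer and aggregation under model heterogeneity in federated graph learning. | representation learning, distillation |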

🔬 Pillar 9: Embodied Foundation Models (9 papers)

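| # | Title | Key point | Tags |
|---|-------|-----------|------|
| 19 | Foundation Models for CPS-IoT: Opportunities and Challenges | Analyzes the development directions and challenges of domain-specific foundation models for CPS-IoT. | large language model, foundation model, multimodal |
| 20 | Multimodal AI on Wound Images and Clinical Notes for Home Patient Referral | Proposes DM-WAT, which uses multimodal AI to support referral decisions for home-care patients with chronic wounds. | multimodal |
| 21 | Ehrenfeucht-Haussler Rank and Chain of Thought | Proposes a new Transformer-based characterization of the rank of Boolean functions. | chain-of-thought |
| 22 | GANQ: GPU-Adaptive Non-Uniform Quantization for Large Language Models | Proposes GANQ to address quantization and efficiency problems in large language models. | large language model |
| 23 | Correctness Assessment of Code Generated by Large Language Models Using Internal Representations | OPENIA: uses the internal representations of LLMs to assess code correctness, improving code generation quality. | large language model |
| 24 | IC-Cache: Efficient Large Language Model Serving via In-context Caching | IC-Cache: improves the efficiency of large language model serving through in-context caching. | large language model |
| 25 | SD-LoRA: Scalable Decoupled Low-Rank Adaptation for Class Incremental Learning | Proposes SD-LoRA, which addresses scalability in LoRA-based class-incremental learning. | foundation model |
| 26 | Multi-Objective Hyperparameter Selection via Hypothesis Testing on Reliability Graphs | Proposes RG-PT, a Pareto testing method based on reliability graphs, for multi-objective hyperparameter selection that balances reliability and cost. | large language model |
| 27 | Multivariate Time Series Anomaly Detection by Capturing Coarse-Grained Intra- and Inter-Variate Dependencies | MtsCID: detects anomalies in multivariate time series by capturing coarse-grained temporal dependencies. | TAMP |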

🔬 Pillar 5: Interaction & Reaction (1 paper)

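| # | Title | Key point | Tags |
|---|-------|-----------|------|
| 28 | A Selective Homomorphic Encryption Approach for Faster Privacy-Preserving Federated Learning | Proposes FAS, a selective homomorphic encryption method that accelerates privacy-preserving federated learning. | OMOMO |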

🔬 Pillar 8: Physics-based Animation (1 paper)

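| # | Title | Key point | Tags |
|---|-------|-----------|------|
| 29 | T-Graphormer: Using Transformers for Spatiotemporal Forecasting | Proposes T-Graphormer, which uses a Transformer to model spatial and temporal correlations simultaneously, improving spatiotemporal forecasting accuracy. | spatiotemporal |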
