cs.LG (2024-10-18)

📊 22 papers in total | 🔗 2 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (12 🔗1) · Pillar 2: RL & Architecture (9 🔗1) · Pillar 8: Physics-based Animation (1)

🔬 Pillar 9: Embodied Foundation Models (12 papers)

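| # | Title | One-sentence summary | Tags | 🔗 |
|---|-------|----------------------|------|----|
| 1 | FedSpaLLM: Federated Pruning of Large Language Models | Proposes FedSpaLLM to address pruning of large language models in privacy-sensitive settings | large language model | |
| 2 | Backdoored Retrievers for Prompt Injection Attacks on Retrieval Augmented Generation of Large Language Models | Proposes a backdoor attack on the retriever in RAG that raises the success rate of prompt injection attacks | large language model | |
| 3 | Revisiting Service Level Objectives and System Level Metrics in Large Language Model Serving | For LLM serving, proposes SLOs better aligned with user experience and a comprehensive evaluation framework, Smooth Goodput | large language model | |
| 4 | Electrocardiogram-Language Model for Few-Shot Question Answering with Meta Learning | Proposes a meta-learning method based on an electrocardiogram-language model for few-shot ECG question answering | large language model, multimodal | |
| 5 | Make LLMs better zero-shot reasoners: Structure-orientated autonomous reasoning | Proposes SARA, a structure-oriented autonomous reasoning framework, to improve LLM zero-shot complex reasoning | large language model, chain-of-thought | |
| 6 | EvoPress: Accurate Dynamic Model Compression via Evolutionary Search | EvoPress: accurate dynamic model compression via evolutionary search | large language model | |
| 7 | On the Regularization of Learnable Embeddings for Time Series Forecasting | For time series forecasting, proposes a method for regularizing learnable embeddings to improve model generalization | foundation model | |
| 8 | Understanding the Difficulty of Low-Precision Post-Training Quantization for LLMs | Shows why low-precision post-training quantization of LLMs is difficult: local error optimization is misaligned with the global objective | large language model | |
| 9 | Efficient Annotator Reliability Assessment and Sample Weighting for Knowledge-Based Misinformation Detection on Social Media | Proposes the EffiARA framework, which weights samples by assessed annotator reliability to improve knowledge-based misinformation detection on social media | large language model | |
| 10 | Debug Smarter, Not Harder: AI Agents for Error Resolution in Computational Notebooks | Proposes an AI agent for computational notebooks that automates error resolution | large language model | |
| 11 | Investigating the Capabilities of Deep Learning for Processing and Interpreting One-Shot Multi-offset GPR Data: A Numerical Case Study for Lunar and Martian Environments | Uses deep learning to process GPR data for lunar and Martian exploration: a one-shot multi-offset numerical case study | foundation model | |
| 12 | Attuned to Change: Causal Fine-Tuning under Latent-Confounded Shifts | Proposes a causal fine-tuning method to address poor model generalization caused by latent confounders | foundation model | |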

🔬 Pillar 2: RL & Architecture (9 papers)

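| # | Title | One-sentence summary | Tags | 🔗 |
|---|-------|----------------------|------|----|
| 13 | A Large Language Model-Driven Reward Design Framework via Dynamic Feedback for Reinforcement Learning | Proposes the CARD framework, which uses LLM-driven reward function design with dynamic feedback to improve reinforcement learning performance | reinforcement learning, reward design, large language model | |
| 14 | DistRL: An Asynchronous Distributed Reinforcement Learning Framework for On-Device Control Agents | Proposes DistRL, an asynchronous distributed reinforcement learning framework for on-device control agents that improves training efficiency | reinforcement learning, large language model, multimodal | |
| 15 | Inverse Reinforcement Learning from Non-Stationary Learning Agents | Proposes an inverse reinforcement learning method based on Bundle Behavior Cloning to learn reward functions from non-stationary learning agents | reinforcement learning, behavior cloning, inverse reinforcement learning | |
| 16 | Streaming Deep Reinforcement Learning Finally Works | Proposes the Stream-x algorithms, which overcome the barriers to streaming deep reinforcement learning and achieve efficient, stable learning | reinforcement learning, deep reinforcement learning | |
| 17 | How to Evaluate Reward Models for RLHF | Proposes Preference Proxy Evaluations (PPE) for efficiently evaluating RLHF reward models | reinforcement learning, RLHF, predictive model | |
| 18 | Self-supervised contrastive learning performs non-linear system identification | Proposes dynamic contrastive learning, which performs non-linear system identification through self-supervised learning | representation learning, contrastive learning | |
| 19 | Online Reinforcement Learning with Passive Memory | Proposes an online reinforcement learning algorithm that uses passive memory, improving performance with near-optimal regret guarantees | reinforcement learning | |
| 20 | Graph Contrastive Learning via Cluster-refined Negative Sampling for Semi-supervised Text Classification | Proposes ClusterText, which uses cluster-refined negative sampling to address over-clustering in graph contrastive learning and improve semi-supervised text classification | contrastive learning | |
| 21 | Transfer Reinforcement Learning in Heterogeneous Action Spaces using Subgoal Mapping | Proposes a transfer reinforcement learning method based on subgoal mapping to address policy transfer across heterogeneous action spaces | reinforcement learning | |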

🔬 Pillar 8: Physics-based Animation (1 paper)

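| # | Title | One-sentence summary | Tags | 🔗 |
|---|-------|----------------------|------|----|
| 22 | PLMTrajRec: A Scalable and Generalizable Trajectory Recovery Method with Pre-trained Language Models | PLMTrajRec: a scalable and generalizable trajectory recovery method based on pre-trained language models | spatiotemporal | |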
