cs.LG (2024-06-27)

📊 19 papers in total | 🔗 4 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (10 🔗2) · Pillar 2: RL & Architecture (9 🔗2)

🔬 Pillar 9: Embodied Foundation Models (10 papers)

| # | Title | One-line takeaway | Tags | 🔗 |
|---|---|---|---|---|
| 1 | MissionGNN: Hierarchical Multimodal GNN-based Weakly Supervised Video Anomaly Recognition with Mission-Specific Knowledge Graph Generation | Weakly supervised video anomaly recognition via a hierarchical multimodal GNN and mission-specific knowledge-graph generation | large language model, multimodal | |
| 2 | LICO: Large Language Models for In-Context Molecular Optimization | An in-context learning framework that applies LLMs to molecular optimization | large language model | |
| 3 | A look under the hood of the Interactive Deep Learning Enterprise (No-IDLE) | The No-IDLE prototype: exploring interactive deep learning for non-expert users | large language model, multimodal | |
| 4 | The Remarkable Robustness of LLMs: Stages of Inference? | Probes LLM robustness and internal inference stages by deleting and swapping layers | large language model | |
| 5 | Granite-Function Calling Model: Introducing Function Calling Abilities via Multi-task Learning of Granular Tasks | Introduces Granite-20B-FUNCTIONCALLING, improving LLM function calling via multi-task learning over granular tasks | large language model | |
| 6 | Jump Starting Bandits with LLM-Generated Prior Knowledge | Uses LLM-generated prior knowledge to jump-start contextual bandit learning | large language model | |
| 7 | From Artificial Needles to Real Haystacks: Improving Retrieval Capabilities in LLMs by Finetuning on Synthetic Data | Finetuning on synthetic data improves LLM information retrieval over long contexts | large language model | |
| 8 | Towards Learning Abductive Reasoning using VSA Distributed Representations | Proposes ARLC, which learns abductive reasoning with VSA distributed representations to solve abstract reasoning tasks | large language model | |
| 9 | A Teacher Is Worth A Million Instructions | Knowledge distillation from a large teacher model plus domain alignment improves instruction tuning of smaller models | large language model | |
| 10 | UnUnlearning: Unlearning is not sufficient for content regulation in advanced generative AI | Exposes a content-regulation dilemma: unlearning alone cannot prevent knowledge reintroduction via in-context learning | large language model | |

🔬 Pillar 2: RL & Architecture (9 papers)

| # | Title | One-line takeaway | Tags | 🔗 |
|---|---|---|---|---|
| 11 | OmniJARVIS: Unified Vision-Language-Action Tokenization Enables Open-World Instruction Following Agents | Unified vision-language-action tokenization enables open-world instruction-following agents | imitation learning, vision-language-action, VLA | |
| 12 | From Efficient Multimodal Models to World Models: A Survey | Surveys multimodal large models: key techniques and challenges on the road to world models and general AI | world model, large language model, multimodal | |
| 13 | Curriculum Learning with Quality-Driven Data Selection | Curriculum learning with quality-driven data selection improves multimodal LLM performance | curriculum learning, large language model, multimodal | |
| 14 | Efficient World Models with Context-Aware Tokenization | Proposes Δ-IRIS, an efficient world model built on context-aware tokenization that sets a new state of the art on the Crafter benchmark | reinforcement learning, deep reinforcement learning, world model | |
| 15 | Averaging log-likelihoods in direct alignment | A length-invariant direct alignment method that improves agreement between LLMs and human judgments | reinforcement learning, RLHF, large language model | |
| 16 | Instance Temperature Knowledge Distillation | RL-based instance-temperature knowledge distillation improves student network performance | reinforcement learning, distillation | |
| 17 | Leveraging Contrastive Learning for Enhanced Node Representations in Tokenized Graph Transformers | GCFormer: contrastive learning enhances node representations in tokenized graph Transformers, improving node classification | contrastive learning | |
| 18 | Contrastive Policy Gradient: Aligning LLMs on sequence-level scores in a supervised-friendly fashion | Proposes Contrastive Policy Gradient (CoPG) to align LLMs on sequence-level rewards while remaining compatible with supervised learning | reinforcement learning, large language model | |
| 19 | Decoding-Time Language Model Alignment with Multiple Objectives | Proposes Multi-Objective Decoding (MOD) to align language models across multiple objectives at decoding time | PPO, DPO | |
