cs.LG (2025-01-16)
📊 13 papers in total | 🔗 1 with code
🎯 Interest Area Navigation
Pillar 2: RL Algorithms & Architecture (6 🔗1)
Pillar 9: Embodied Foundation Models (6)
Pillar 7: Motion Retargeting (1)
🔬 Pillar 2: RL Algorithms & Architecture (6 papers)
| # | Title | One-Sentence Summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 1 | Beyond Reward Hacking: Causal Rewards for Large Language Model Alignment | Proposes a causal reward modeling method to improve the reliability and fairness of large language model alignment | reinforcement learning, RLHF, large language model | | |
| 2 | Enhancing Generalization in Chain of Thought Reasoning for Smaller Models | Proposes the PRADA framework to improve the generalization of smaller models in chain-of-thought reasoning | distillation, chain-of-thought | | |
| 3 | Optimization Strategies for Enhancing Resource Efficiency in Transformers & Large Language Models | Proposes optimization strategies to improve the resource efficiency of Transformers and LLMs | distillation, large language model | | |
| 4 | From Explainability to Interpretability: Interpretable Policies in Reinforcement Learning Via Model Explanation | Proposes a Shapley-value-based model explanation method to improve the interpretability of reinforcement learning policies | reinforcement learning, deep reinforcement learning | | |
| 5 | Class Incremental Fault Diagnosis under Limited Fault Data via Supervised Contrastive Knowledge Distillation | Proposes the SCLIFD framework to address catastrophic forgetting and class imbalance in class-incremental fault diagnosis with few samples | representation learning, distillation | ✅ | |
| 6 | Clone-Robust AI Alignment | Proposes a weighted MLE algorithm to strengthen the clone robustness of RLHF on non-uniform datasets | reinforcement learning, RLHF, large language model | | |
🔬 Pillar 9: Embodied Foundation Models (6 papers)
| # | Title | One-Sentence Summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
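| 7 | U-Fair: Uncertainty-based Multimodal Multitask Learning for Fairer Depression Detection | Proposes U-Fair, an uncertainty-based multimodal multitask learning framework for fairer depression detection | multimodal | | |
| 8 | Large Language Model is Secretly a Protein Sequence Optimizer | Uses large language models to optimize protein sequences, enabling directed evolution | large language model | | |
| 9 | An LLM-Guided Tutoring System for Social Skills Training | Proposes an LLM-guided tutoring system that dynamically generates social skills training scenarios | large language model | | |
| 10 | Cueless EEG imagined speech for subject identification: dataset and benchmarks | Proposes a cueless EEG imagined-speech paradigm for secure and reliable subject identification | foundation model | | |
| 11 | Rational Tuning of LLM Cascades via Probabilistic Modeling | Proposes a probabilistic model to optimize the confidence thresholds of LLM cascades | large language model | | |
| 12 | Confidence Estimation for Error Detection in Text-to-SQL Systems | Proposes an entropy-based confidence estimation method for error detection in Text-to-SQL systems | large language model | | |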
🔬 Pillar 7: Motion Retargeting (1 paper)
| # | Title | One-Sentence Summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
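| 13 | Predicting Air Temperature from Volumetric Urban Morphology with Machine Learning | Proposes an urban air temperature prediction method based on voxelized urban morphology and machine learning, to support urban planning | spatial relationship | | |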