cs.LG (2025-03-13)

📊 33 papers in total | 🔗 3 with code

🎯 Navigation by Interest Area

Pillar 2: RL & Architecture (16 🔗1) Pillar 9: Embodied Foundation Models (13 🔗2) Pillar 4: Generative Motion (2) Pillar 1: Robot Control (1) Pillar 8: Physics-based Animation (1)

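🔬 Pillar 2: RL & Architecture (16 papers)

#	Title	Key Point	Tags	🔗	⭐
1	DeepSeek-Inspired Exploration of RL-based LLMs and Synergy with Wireless Networks: A Survey	Surveys DeepSeek-inspired RL-based LLMs and their synergy with wireless networks, aimed at improving network optimization and model deployment.	reinforcement learning embodied AI large language model
2	Out-of-Context Reasoning in Large Language Models	Studies LLMs' ability to reason over axiomatic relations learned during training and proposes a lightweight representation learning method.	representation learning large language model
3	Enhance Exploration in Safe Reinforcement Learning with Contrastive Representation Learning	Proposes a method based on contrastive representation learning to enhance exploration in safe reinforcement learning.	reinforcement learning representation learning contrastive learning
4	PIMRL: Physics-Informed Multi-Scale Recurrent Learning for Burst-Sampled Spatiotemporal Dynamics	PIMRL: physics-informed multi-scale recurrent learning for burst-sampled spatiotemporal dynamics.	latent dynamics spatiotemporal
5	PluralLLM: Pluralistic Alignment in LLMs via Federated Learning	PluralLLM: pluralistic alignment in LLMs via federated learning.	reinforcement learning preference learning RLHF
6	From Actions to Words: Towards Abstractive-Textual Policy Summarization in RL	Proposes the SySLLM framework, which uses large language models to produce abstractive textual summaries of RL policies.	reinforcement learning spatiotemporal large language model
7	Mamba time series forecasting with uncertainty quantification	Proposes Mamba-ProbTSF for time series forecasting with quantified predictive uncertainty.	Mamba state space model	✅
8	Policy Teaching via Data Poisoning in Learning from Human Preferences	Teaches a target policy through data-poisoning attacks on learning from human preferences.	reinforcement learning RLHF DPO
9	Probabilistic Forecasting via Autoregressive Flow Matching	Proposes FlowTime, a probabilistic time series forecasting model based on autoregressive flow matching.	flow matching
10	Collaborative Speculative Inference for Efficient LLM Inference Serving	Proposes CoSine, which accelerates LLM inference serving through collaborative speculation, improving resource utilization and throughput.	SSM large language model
11	TacticExpert: Spatial-Temporal Graph Language Model for Basketball Tactics	TacticExpert: a spatial-temporal graph language model for modeling and predicting basketball tactics.	contrastive learning large language model
12	Accuracy of Discretely Sampled Stochastic Policies in Continuous-time Reinforcement Learning	Proposes an accuracy analysis framework for discretely sampled stochastic policies in continuous-time reinforcement learning.	reinforcement learning
13	Inter-environmental world modeling for continuous and compositional dynamics	Proposes a world modeling method based on Lie group actions for general agent control in environments with continuous, compositional dynamics.	world model
14	Fixed-Point RNNs: Interpolating from Diagonal to Dense	Proposes a sequence modeling method based on fixed-point RNNs that balances efficiency and expressivity.	Mamba SSM
15	SortingEnv: An Extendable RL-Environment for an Industrial Sorting Process	Proposes SortingEnv for optimizing industrial sorting systems and studying agent behavior in evolving environments.	reinforcement learning PPO
16	Towards Constraint-Based Adaptive Hypergraph Learning for Solving Vehicle Routing: An End-to-End Solution	Proposes a constraint-based adaptive hypergraph learning framework that solves vehicle routing problems end to end.	reinforcement learning representation learning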

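🔬 Pillar 9: Embodied Foundation Models (13 papers)

#	Title	Key Point	Tags	🔗	⭐
17	Panopticon: Advancing Any-Sensor Foundation Models for Earth Observation	Panopticon: an any-sensor foundation model for Earth observation, with significant performance gains.	foundation model
18	Evaluating Mathematical Reasoning Across Large Language Models: A Fine-Grained Approach	Systematically evaluates the mathematical reasoning of large language models, revealing links between model architecture and performance.	large language model
19	BeamLLM: Vision-Empowered mmWave Beam Prediction with Large Language Models	Proposes BeamLLM, which uses vision-empowered large language models for mmWave beam prediction, addressing high training overhead and latency.	large language model
20	Numerical Error Analysis of Large Language Models	Analyzes numerical errors in large language models and proposes mitigation strategies to improve training stability.	large language model
21	Efficient Federated Fine-Tuning of Large Language Models with Layer Dropout	Proposes DropPEFT for efficient federated fine-tuning of large language models via layer dropout.	large language model
22	Robustness Tokens: Towards Adversarial Robustness of Transformers	Proposes Robustness Tokens to improve the robustness of Transformer models against adversarial attacks.	foundation model
23	MentalChat16K: A Benchmark Dataset for Conversational Mental Health Assistance	MentalChat16K: a benchmark dataset for conversational mental health assistance.	large language model	✅
24	ASIDE: Architectural Separation of Instructions and Data in Language Models	ASIDE: improves language model safety through architectural separation of instructions and data.	large language model	✅
25	DP-GPL: Differentially Private Graph Prompt Learning	Proposes DP-GPL, which addresses privacy leakage in graph prompt learning by generating differentially private graph prompts.	foundation model
26	Conformal Prediction Sets for Deep Generative Models via Reduction to Conformal Regression	Proposes the Generative Prediction Sets (GPS) algorithm, which produces prediction sets with validity guarantees for deep generative models.	large language model
27	Capturing Semantic Flow of ML-based Systems	Proposes semantic flow for capturing and analyzing the internal behavior of ML-based systems.	large language model
28	Samoyeds: Accelerating MoE Models with Structured Sparsity Leveraging Sparse Tensor Cores	Samoyeds: accelerates MoE models with dual-side structured sparsity using Sparse Tensor Cores.	large language model
29	From Equations to Insights: Unraveling Symbolic Structures in PDEs with LLMs	Uses large language models to uncover symbolic structures in PDEs, improving solver efficiency and accuracy.	large language model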

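🔬 Pillar 4: Generative Motion (2 papers)

#	Title	Key Point	Tags	🔗	⭐
30	BioSerenity-E1: a self-supervised EEG model for medical applications	BioSerenity-E1: a self-supervised EEG model for medical applications that achieves SOTA performance on multiple diagnostic tasks.	VQ-VAE spatiotemporal foundation model
31	Streaming Generation of Co-Speech Gestures via Accelerated Rolling Diffusion	Proposes a streaming gesture generation framework based on accelerated rolling diffusion for real-time co-speech gesture generation.	motion synthesis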

🔬 Pillar 1: Robot Control (1 paper)

#	Title	Key Point	Tags	🔗	⭐
32	MIP against Agent: Malicious Image Patches Hijacking Multimodal OS Agents	Proposes the Malicious Image Patch (MIP) attack, which hijacks multimodal OS agents.	manipulation multimodal

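🔬 Pillar 8: Physics-based Animation (1 paper)

#	Title	Key Point	Tags	🔗	⭐
33	Keyframe-oriented Vision Token Pruning: Enhancing Efficiency of Large Vision Language Models on Long-Form Video Processing	Proposes KVTP, keyframe-oriented vision token pruning, to improve the efficiency of large vision-language models on long-form video understanding.	spatiotemporal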
