| 1 |
Massively Multimodal Foundation Models: A Framework for Capturing Dependencies with Specialized Mixture-of-Experts |
Proposes a large-scale multimodal framework based on mixture-of-experts, using temporal dependencies to guide expert routing.
foundation model multimodal |
|
|
| 2 |
DecepChain: Inducing Deceptive Reasoning in Large Language Models |
DecepChain: inducing large language models to produce deceptive reasoning chains
large language model chain-of-thought |
✅ |
|
| 3 |
MultiFair: Multimodal Balanced Fairness-Aware Medical Classification with Dual-Level Gradient Modulation |
Proposes MultiFair, achieving balanced fairness-aware multimodal medical classification via dual-level gradient modulation.
multimodal |
|
|
| 4 |
Free Draft-and-Verification: Toward Lossless Parallel Decoding for Diffusion Large Language Models |
Proposes the FreeDave algorithm, enabling lossless parallel decoding acceleration for diffusion large language models.
large language model |
|
|
| 5 |
Large Language Models Inference Engines based on Spiking Neural Networks |
Proposes NeurTransformer, a design methodology for large language model inference engines based on spiking neural networks.
large language model |
|
|
| 6 |
AccidentBench: Benchmarking Multimodal Understanding and Reasoning in Vehicle Accidents and Beyond |
AccidentBench: a large-scale multimodal benchmark for evaluating understanding and reasoning in vehicle accidents and other safety-critical scenarios
multimodal |
✅ |
|
| 7 |
Memory-Driven Self-Improvement for Decision Making with Large Language Models |
Proposes a memory-driven self-improvement framework that boosts LLM performance on sequential decision-making tasks
large language model |
|
|
| 8 |
NeuroTTT: Bridging Pretraining-Downstream Task Misalignment in EEG Foundation Models via Test-Time Training |
Proposes NeuroTTT, bridging the misalignment between pretrained EEG foundation models and downstream tasks via test-time training
foundation model |
✅ |
|
| 9 |
MIDAS: Misalignment-based Data Augmentation Strategy for Imbalanced Multimodal Learning |
Proposes MIDAS, addressing imbalanced multimodal learning through misalignment-based data augmentation
multimodal |
|
|
| 10 |
Kairos: Towards Adaptive and Generalizable Time Series Foundation Models |
Kairos: dynamic foundation models for adaptive and generalizable time series
foundation model |
✅ |
|
| 11 |
Layer-wise dynamic rank for compressing large language models |
Proposes D-Rank, an LLM compression framework with layer-wise dynamic rank allocation that improves compression performance.
large language model |
|
|
| 12 |
Learning to See Before Seeing: Demystifying LLM Visual Priors from Language Pre-training |
Demystifies LLM visual priors: visual perception and reasoning abilities acquired through language pre-training
large language model multimodal |
|
|
| 13 |
ACT: Agentic Classification Tree |
Proposes the Agentic Classification Tree (ACT), using LLMs to build interpretable decision trees for unstructured data.
large language model chain-of-thought |
|
|
| 14 |
Attribution-Guided Decoding |
Proposes Attribution-Guided Decoding (AGD), improving LLM instruction following and knowledge accuracy.
large language model instruction following |
|
|
| 15 |
Expert Merging: Model Merging with Unsupervised Expert Alignment and Importance-Guided Layer Chunking |
Proposes Expert Merging, performing model merging via unsupervised expert alignment and importance-guided layer chunking.
large language model multimodal |
✅ |
|
| 16 |
Adaptive and Resource-efficient Agentic AI Systems for Mobile and Embedded Devices: A Survey |
A survey of adaptive and resource-efficient agentic AI systems for mobile and embedded devices
foundation model multimodal |
|
|
| 17 |
LLM-Generated Samples for Android Malware Detection |
Uses LLM-generated samples to augment Android malware detection, improving performance on sparse datasets.
large language model |
|
|
| 18 |
In-Context Curiosity: Distilling Exploration for Decision-Pretrained Transformers on Bandit Tasks |
Proposes an in-context curiosity mechanism to enhance the generalization of decision-pretrained Transformers on bandit tasks
large language model |
|
|
| 19 |
Which Programming Language and Model Work Best With LLM-as-a-Judge For Code Retrieval? |
Studies how the choice of programming language and model affects LLM-as-a-judge for code retrieval, and proposes a transfer learning approach.
large language model |
✅ |
|
| 20 |
From Trace to Line: LLM Agent for Real-World OSS Vulnerability Localization |
T2L-Agent: line-level localization of real-world open-source software vulnerabilities using LLMs and runtime traces
large language model |
|
|
| 21 |
DiSC-AMC: Token- and Parameter-Efficient Discretized Statistics In-Context Automatic Modulation Classification |
DiSC-AMC: token- and parameter-efficient in-context automatic modulation classification with discretized statistics
large language model |
|
|
| 22 |
Beyond Token Probes: Hallucination Detection via Activation Tensors with ACT-ViT |
Proposes ACT-ViT, detecting hallucinations in large language models via activation tensors
large language model |
✅ |
|
| 23 |
Predicting Effects, Missing Distributions: Evaluating LLMs as Human Behavior Simulators in Operations Management |
Evaluates LLMs as human behavior simulators in operations management: predicting effects while missing distributions
chain-of-thought |
|
|
| 24 |
The Pitfalls of KV Cache Compression |
Reveals the pitfalls of KV cache compression in multi-instruction scenarios and proposes improvements
instruction following |
|
|
| 25 |
Thoughtbubbles: an Unsupervised Method for Parallel Thinking in Latent Space |
Proposes Thoughtbubbles, an unsupervised Transformer method for parallel adaptive computation in latent space.
chain-of-thought |
|
|
| 26 |
LoRAFusion: Efficient LoRA Fine-Tuning for LLMs |
LoRAFusion: an efficient LoRA fine-tuning system for LLMs that accelerates both single-task and multi-task fine-tuning.
large language model |
✅ |
|
| 27 |
GRPO-$λ$: Credit Assignment improves LLM Reasoning |
GRPO-λ: improving the reasoning ability of large language models through better credit assignment
large language model |
|
|
| 28 |
PrunedLoRA: Robust Gradient-Based structured pruning for Low-rank Adaptation in Fine-tuning |
PrunedLoRA: robust gradient-based structured pruning for low-rank adaptation in fine-tuning.
large language model |
|
|
| 29 |
Why Can't Transformers Learn Multiplication? Reverse-Engineering Reveals Long-Range Dependency Pitfalls |
Why Transformers struggle to learn multiplication: reverse-engineering reveals long-range dependency pitfalls
chain-of-thought |
|
|
| 30 |
Estimating Dimensionality of Neural Representations from Finite Samples |
Proposes a bias-corrected estimator to address the sample-size dependence of dimensionality estimates for neural representations.
large language model |
|
|
| 31 |
TASP: Topology-aware Sequence Parallelism |
Proposes TASP, accelerating long-context LLM training with topology-aware sequence parallelism.
large language model |
✅ |
|
| 32 |
AdaBlock-dLLM: Semantic-Aware Diffusion LLM Inference via Adaptive Block Size |
AdaBlock-dLLM: semantic-aware diffusion LLM inference via adaptive block sizes
large language model |
|
|
| 33 |
Are neural scaling laws leading quantum chemistry astray? |
Examines the challenges neural scaling laws face in quantum chemistry: simply scaling up models and data does not guarantee reliability
foundation model |
|
|
| 34 |
Beyond Linear Probes: Dynamic Safety Monitoring for Language Models |
Proposes truncated polynomial classifiers for dynamic safety monitoring of large language models, balancing compute efficiency and safety.
large language model |
✅ |
|
| 35 |
Muon Outperforms Adam in Tail-End Associative Memory Learning |
The Muon optimizer outperforms Adam in tail-end associative memory learning, improving performance on tail classes
large language model |
|
|
| 36 |
Better Privilege Separation for Agents by Restricting Data Types |
Proposes a type-restricted privilege separation approach to systematically defend AI agents against prompt injection attacks.
large language model |
|
|
| 37 |
Rotation Control Unlearning: Quantifying and Controlling Continuous Unlearning for LLM with The Cognitive Rotation Space |
Proposes Rotation Control Unlearning (RCU) to address catastrophic utility loss in continual unlearning for LLMs.
large language model |
|
|