| 1 |
When Continue Learning Meets Multimodal Large Language Model: A Survey |
A survey of continual learning for multimodal large language models, addressing the challenge of catastrophic forgetting.
large language model multimodal |
|
|
| 2 |
R2-T2: Re-Routing in Test-Time for Multimodal Mixture-of-Experts |
R2-T2: a test-time re-routing method for multimodal mixture-of-experts models that improves downstream task performance.
large language model multimodal |
|
|
| 3 |
MMSciBench: Benchmarking Language Models on Chinese Multimodal Scientific Problems |
MMSciBench: a benchmark for evaluating language models on Chinese multimodal scientific problems.
large language model multimodal |
|
|
| 4 |
SeisMoLLM: Advancing Seismic Monitoring via Cross-modal Transfer with Pre-trained Large Language Model |
SeisMoLLM: advances seismic monitoring via cross-modal transfer with a pre-trained large language model.
large language model foundation model |
|
|
| 5 |
Evaluating System 1 vs. 2 Reasoning Approaches for Zero-Shot Time Series Forecasting: A Benchmark and Insights |
ReC4TS: the first benchmark, with accompanying insights, for evaluating reasoning in zero-shot time-series forecasting.
large language model foundation model multimodal |
✅ |
|
| 6 |
Conformal Tail Risk Control for Large Language Model Alignment |
Proposes a Conformal Risk Control-based framework for LLM alignment, addressing tail-risk control under discrepancies between human and machine scoring.
large language model |
|
|
| 7 |
Large Language Models as Attribution Regularizers for Efficient Model Training |
Proposes an efficient training method that uses LLM-derived attributions as regularizers, improving small-model performance in few-shot learning.
large language model |
|
|
| 8 |
Mixtera: A Data Plane for Foundation Model Training |
Mixtera: a data plane for foundation model training that supports declarative data mixing and dynamic adjustment.
foundation model |
|
|
| 9 |
Tokens for Learning, Tokens for Unlearning: Mitigating Membership Inference Attacks in Large Language Models via Dual-Purpose Training |
Proposes a dual-purpose training method that separates learning from unlearning at the token level, mitigating membership inference attacks on large language models.
large language model |
|
|
| 10 |
Taxonomy, Opportunities, and Challenges of Representation Engineering for Large Language Models |
Presents a taxonomy of representation engineering for large language models, along with its opportunities and challenges, toward more effective and interpretable behavior control.
large language model |
|
|
| 11 |
Walking the Web of Concept-Class Relationships in Incrementally Trained Interpretable Models |
Proposes MuCIL, which preserves and strengthens concept-class relationships in incrementally trained interpretable models.
multimodal |
|
|
| 12 |
Judge a Book by its Cover: Investigating Multi-Modal LLMs for Multi-Page Handwritten Document Transcription |
Investigates multimodal large language models for transcribing multi-page handwritten documents.
large language model |
|
|
| 13 |
Stochastic Rounding for LLM Training: Theory and Practice |
Proposes a BF16 training strategy based on stochastic rounding, improving LLM training efficiency and stability.
large language model |
|
|
| 14 |
SoS1: O1 and R1-Like Reasoning LLMs are Sum-of-Square Solvers |
SoS1: O1- and R1-like reasoning LLMs act as sum-of-squares solvers, significantly improving their ability to decide polynomial nonnegativity.
large language model |
|
|
| 15 |
Why Are Web AI Agents More Vulnerable Than Standalone LLMs? A Security Analysis |
A security analysis of web AI agents, revealing why they are more vulnerable to attacks than standalone LLMs.
large language model |
|
|
| 16 |
PhantomWiki: On-Demand Datasets for Reasoning and Retrieval Evaluation |
Proposes PhantomWiki, which generates datasets on demand to evaluate the reasoning and retrieval abilities of LLMs.
large language model |
✅ |
|
| 17 |
Mixture of Experts for Recognizing Depression from Interview and Reading Tasks |
Proposes a mixture-of-experts method for recognizing depression from speech, fusing recordings from interview and reading tasks.
multimodal |
|
|
| 18 |
AutoHete: An Automatic and Efficient Heterogeneous Training System for LLMs |
AutoHete: an automatic and efficient heterogeneous training system for LLMs that improves training throughput.
large language model |
|
|
| 19 |
SkipPipe: Partial and Reordered Pipelining Framework for Training LLMs in Heterogeneous Networks |
SkipPipe: a partial and reordered pipelining framework for training LLMs over heterogeneous networks.
large language model |
✅ |
|
| 20 |
MobiLLM: Enabling LLM Fine-Tuning on the Mobile Device via Server Assisted Side Tuning |
MobiLLM: enables LLM fine-tuning on mobile devices via server-assisted side tuning.
large language model |
|
|
| 21 |
Implicit Search via Discrete Diffusion: A Study on Chess |
Proposes DiffuSearch, which performs implicit search via discrete diffusion, improving AI planning ability in chess.
large language model |
✅ |
|
| 22 |
Adaptive Attacks Break Defenses Against Indirect Prompt Injection Attacks on LLM Agents |
Adaptive attacks break defenses against indirect prompt injection attacks on LLM agents.
large language model |
✅ |
|