| # | Title | Summary | Topic | ✓ |
|---|-------|---------|-------|---|
| 1 | AFLoRA: Adaptive Federated Fine-Tuning of Large Language Models with Resource-Aware Low-Rank Adaption | AFLoRA: adaptive federated fine-tuning of large language models for heterogeneous-resource environments. | large language model, foundation model | |
| 2 | Proxy-FDA: Proxy-based Feature Distribution Alignment for Fine-tuning Vision Foundation Models without Forgetting | Proposes Proxy-FDA, which mitigates concept forgetting when fine-tuning vision foundation models via proxy-based feature distribution alignment. | foundation model | |
| 3 | Beyond Atomic Geometry Representations in Materials Science: A Human-in-the-Loop Multimodal Framework | Proposes MCS-Set, a multimodal materials-science framework that fuses atomic structures, 2D projections, and text annotations to improve property prediction and crystal generation. | multimodal | ✅ |
| 4 | Intercept Cancer: Cancer Pre-Screening with Large Scale Healthcare Foundation Models | CATCH-FM: cancer pre-screening with large-scale healthcare foundation models. | foundation model | |
| 5 | Applying Large Language Models to Issue Classification: Revisiting with Extended Data and New Models | Applies large language models to issue classification, extending the dataset and evaluating new models. | large language model | |
| 6 | The Road to Generalizable Neuro-Symbolic Learning Should be Paved with Foundation Models | Leverages pretrained models to empower neuro-symbolic learning, improving generalization on complex reasoning tasks. | foundation model | |
| 7 | PhySense: Principle-Based Physics Reasoning Benchmarking for Large Language Models | PhySense: a principle-based physics-reasoning benchmark for large language models. | large language model | |
| 8 | HELM: Hyperbolic Large Language Models via Mixture-of-Curvature Experts | Proposes HELM: hyperbolic-space large language models built on mixture-of-curvature experts, improving the modeling of geometric structure in text. | large language model | |
| 9 | Learning Safety Constraints for Large Language Models | Proposes the Safety Polytope (SaP), which learns and enforces LLM safety constraints in representation space. | large language model | |
| 10 | Equivalent Linear Mappings of Large Language Models | Proposes equivalent linear mappings to analyze the inference mechanisms of large language models. | large language model | |
| 11 | Generalisation Bounds of Zero-Shot Economic Forecasting using Time Series Foundation Models | Uses time-series foundation models for zero-shot economic forecasting without bespoke training. | foundation model | |
| 12 | Learn from the Past: Fast Sparse Indexing for Large Language Model Decoding | LFPS: exploits historical attention patterns to accelerate sparse-index retrieval in long-context LLM decoding. | large language model | |
| 13 | When GPT Spills the Tea: Comprehensive Assessment of Knowledge File Leakage in GPTs | Reveals knowledge-file leakage risks in GPTs: proposes a comprehensive assessment framework and uncovers multiple leakage channels. | large language model | |
| 14 | Breakpoint: Scalable evaluation of system-level reasoning in LLM code agents | Breakpoint: scalable evaluation of system-level reasoning in LLM code agents via adversarial code corruption. | large language model | |
| 15 | Privacy Amplification in Differentially Private Zeroth-Order Optimization with Hidden States | Establishes convergence guarantees for differentially private zeroth-order optimization with hidden states and improves the algorithm design. | large language model | |
| 16 | Aligning Language Models with Observational Data: Opportunities and Risks from a Causal Perspective | Proposes DeconfoundLM, improving language-model alignment on observational data through causal deconfounding. | large language model | |
| 17 | Chameleon: A Flexible Data-mixing Framework for Language Model Pretraining and Finetuning | Chameleon: a flexible data-mixing framework for language-model pretraining and fine-tuning. | large language model | ✅ |
| 18 | SUMO: Subspace-Aware Moment-Orthogonalization for Accelerating Memory-Efficient LLM Training | SUMO: subspace-aware moment orthogonalization for accelerating memory-efficient LLM training. | large language model | |
| 19 | PDE-Transformer: Efficient and Versatile Transformers for Physics Simulations | Proposes PDE-Transformer for efficient and versatile surrogate modeling of physics simulations. | foundation model | |
| 20 | Beyond Linear Steering: Unified Multi-Attribute Control for Language Models | K-Steering: a unified nonlinear method for multi-attribute control of language models. | large language model | |
| 21 | Can Slow-thinking LLMs Reason Over Time? Empirical Studies in Time Series Forecasting | TimeReasoner: explores the reasoning ability of slow-thinking LLMs in time-series forecasting. | multimodal | |
| 22 | Object Centric Concept Bottlenecks | Proposes Object-Centric Concept Bottlenecks, improving the performance and interpretability of concept-bottleneck models on complex vision tasks. | foundation model | |
| 23 | Every Rollout Counts: Optimal Resource Allocation for Efficient Test-Time Scaling | Proposes DORA: improves the efficiency and accuracy of LLM test-time reasoning through optimal resource allocation. | large language model | |
| 24 | ReCalKV: Low-Rank KV Cache Compression via Head Reordering and Offline Calibration | ReCalKV: low-rank KV-cache compression via head reordering and offline calibration. | large language model | ✅ |
| 25 | SwiftEval: Developing a Language-Specific Benchmark for LLM-generated Code Evaluation | Proposes SwiftEval: a high-quality benchmark for evaluating LLM-generated Swift code. | large language model | |
| 26 | LittleBit: Ultra Low-Bit Quantization via Latent Factorization | LittleBit: ultra-low-bit quantization via latent factorization, dramatically compressing large language models. | large language model | ✅ |
| 27 | Fine-Tune an SLM or Prompt an LLM? The Case of Generating Low-Code Workflows | For low-code workflow generation, shows that fine-tuning a small language model outperforms prompting a large language model on output quality. | large language model | |
| 28 | SALE: Low-bit Estimation for Efficient Sparse Attention in Long-context LLM Prefilling | Proposes SALE: a sparse-attention method based on low-bit estimation that accelerates the long-context LLM prefilling stage. | large language model | |
| 29 | Invariant Link Selector for Spatial-Temporal Out-of-Distribution Problem | Proposes an invariant link selector to address the spatial-temporal out-of-distribution generalization problem on temporal graphs. | foundation model | ✅ |
| 30 | Don't Just Follow MLLM Plans: Robust and Efficient Planning for Open-world Agents | Proposes the REPOA framework for robust and efficient planning of open-world agents. | large language model | |