| 1 | Refining embeddings with fill-tuning: data-efficient generalised performance improvements for materials foundation models | Proposes fill-tuning, a data-efficient method for improving the generalised performance of pretrained materials foundation models | foundation model | |
| 2 | Smaller But Better: Unifying Layout Generation with Smaller Large Language Models | Proposes LGGPT, a unified layout generation model built on a smaller LLM that balances efficiency and performance | large language model | ✅ |
| 3 | Are Large Language Models In-Context Graph Learners? | Proposes a RAG-based framework that improves LLMs' in-context learning on graph learning tasks | large language model | |
| 4 | Train Small, Infer Large: Memory-Efficient LoRA Training for Large Language Models | LoRAM: memory-efficient LoRA training for LLMs by training on a small model and inferring with the large one | large language model | ✅ |
| 5 | GeLLMO: Generalizing Large Language Models for Multi-property Molecule Optimization | Proposes GeLLMO, an instruction-tuned LLM for multi-property molecule optimization | large language model | ✅ |
| 6 | Exploring Code Language Models for Automated HLS-based Hardware Generation: Benchmark, Infrastructure and Analysis | Explores code language models for automated HLS-based hardware generation: benchmark, infrastructure, and analysis | large language model, chain-of-thought | |
| 7 | Quantifying Memorization and Parametric Response Rates in Retrieval-Augmented Vision-Language Models | Quantifies memorization and parametric response rates in retrieval-augmented vision-language models, revealing differences across modalities | large language model, multimodal | |
| 8 | Where's the Bug? Attention Probing for Scalable Fault Localization | Proposes the Bug Attention Probe (BAP), enabling scalable fault localization without annotation | large language model | |
| 9 | SPEX: Scaling Feature Interaction Explanations for LLMs | SPEX: scales feature-interaction explanations for LLMs and handles long inputs efficiently | large language model | |
| 10 | Evaluation of EAS directions based on TAIGA HiSCORE data using fully connected neural networks | Evaluates EAS directions from TAIGA HiSCORE data using fully connected neural networks | multimodal | |
| 11 | LESA: Learnable LLM Layer Scaling-Up | Proposes LESA to address the high cost of training large language models | large language model | |
| 12 | Which Attention Heads Matter for In-Context Learning? | Finds that function vector heads, rather than induction heads, drive LLMs' in-context learning ability | large language model | |
| 13 | Concept Layers: Enhancing Interpretability and Intervenability via LLM Conceptualization | Proposes Concept Layers, which enhance LLM interpretability and intervenability through LLM conceptualization | large language model | |
| 14 | Unraveling the Localized Latents: Learning Stratified Manifold Structures in LLM Embedding Space with Sparse Mixture-of-Experts | Proposes a sparse mixture-of-experts analysis framework that reveals stratified manifold structures in the LLM embedding space | large language model | |
| 15 | LSR-Adapt: Ultra-Efficient Parameter Tuning with Matrix Low Separation Rank Kernel Adaptation | LSR-Adapt: ultra-efficient parameter tuning via matrix low-separation-rank kernel adaptation | large language model | |
| 16 | Megrez-Omni Technical Report | The Megrez model series: hardware-software co-design for fast, compact, and robust on-device intelligence | multimodal | |
| 17 | An explainable transformer circuit for compositional generalization | Explains Transformer compositional generalization by constructing an interpretable circuit that enables precise control of model behavior | large language model | |