| 1 |
Trifuse: Enhancing Attention-Based GUI Grounding via Multimodal Fusion |
Trifuse: Enhancing attention-based GUI element grounding through multimodal fusion |
large language model multimodal |
|
|
| 2 |
POP: Online Structural Pruning Enables Efficient Inference of Large Foundation Models |
POP: Online structural pruning enables efficient large-model inference, balancing accuracy and speed |
large language model foundation model |
|
|
| 3 |
Federated Prompt-Tuning with Heterogeneous and Incomplete Multimodal Client Data |
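Proposes a federated prompt-tuning framework for heterogeneous, incomplete multimodal data, addressing missing data across clients and semantic alignment. |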
multimodal |
|
|
| 4 |
Is there "Secret Sauce" in Large Language Model Development? |
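Large language model performance is driven mainly by compute, but differences in developer efficiency significantly affect non-frontier models |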
large language model |
|
|
| 5 |
Sequences as Nodes for Contrastive Multimodal Graph Recommendation |
Proposes MuSICRec, which uses multimodal contrastive graph recommendation to mitigate cold-start and data-sparsity problems. |
multimodal |
|
|
| 6 |
Multimodal Enhancement of Sequential Recommendation |
Proposes MuSTRec, which fuses multimodal information with sequential recommendation to improve recommendation performance. |
multimodal |
|
|
| 7 |
PreFlect: From Retrospective to Prospective Reflection in Large Language Model Agents |
PreFlect: The shift from retrospective to prospective reflection in large language model agents |
large language model |
✅ |
|
| 8 |
ShallowJail: Steering Jailbreaks against Large Language Models |
Proposes the ShallowJail attack, which exploits shallow alignment to break the safety guardrails of large language models |
large language model |
✅ |
|
| 9 |
The Quantum Sieve Tracer: A Hybrid Framework for Layer-Wise Activation Tracing in Large Language Models |
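Proposes the Quantum Sieve Tracer for layer-wise activation tracing in large language models, revealing differences between model architectures. |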
large language model |
|
|
| 10 |
GhostCite: A Large-Scale Analysis of Citation Validity in the Age of Large Language Models |
GhostCite: A large-scale analysis of citation validity in the era of large language models |
large language model |
|
|
| 11 |
Multimodal Generative Retrieval Model with Staged Pretraining for Food Delivery on Meituan |
Proposes a multimodal generative retrieval model with staged pretraining for the Meituan food-delivery scenario |
multimodal |
|
|
| 12 |
LogicSkills: A Structured Benchmark for Formal Reasoning in Large Language Models |
LogicSkills: A structured benchmark for evaluating the formal reasoning abilities of large language models |
large language model |
|
|
| 13 |
Same Answer, Different Representations: Hidden instability in VLMs |
Reveals instability in the internal representations of vision-language models: same answer, different representations |
multimodal |
|
|
| 14 |
How Well Can LLM Agents Simulate End-User Security and Privacy Attitudes and Behaviors? |
SP-ABCBench evaluates how well LLM agents simulate users' security and privacy attitudes, and finds room for improvement |
large language model |
|
|
| 15 |
TamperBench: Systematically Stress-Testing LLM Safety Under Fine-Tuning and Tampering |
TamperBench: Systematically stress-testing LLM safety under fine-tuning and tampering |
large language model |
✅ |
|
| 16 |
TraceCoder: A Trace-Driven Multi-Agent Framework for Automated Debugging of LLM-Generated Code |
TraceCoder: A multi-agent framework based on runtime traces for automatically debugging LLM-generated code |
large language model |
|
|
| 17 |
ScaleEnv: Scaling Environment Synthesis from Scratch for Generalist Interactive Tool-Use Agent Training |
ScaleEnv: Scaling environment synthesis from scratch for training generalist interactive tool-use agents |
generalist agent |
|
|
| 18 |
Bridging 6G IoT and AI: LLM-Based Efficient Approach for Physical Layer's Optimization Tasks |
Proposes PE-RTFV, an LLM-based framework for 6G IoT physical-layer optimization |
large language model |
|
|
| 19 |
Wild Guesses and Mild Guesses in Active Concept Learning |
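Studies how query strategies affect a neuro-symbolic Bayesian learner in active concept learning, revealing the potential rationality of confirmation bias. |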
large language model |
|
|
| 20 |
Evidence for Daily and Weekly Periodic Variability in GPT-4o Performance |
Reveals daily and weekly periodic fluctuations in GPT-4o performance, challenging the assumption of time invariance |
large language model |
|
|
| 21 |
AgentStepper: Interactive Debugging of Software Development Agents |
AgentStepper: A tool for interactive debugging of software development agents |
large language model |
|
|
| 22 |
Lemon Agent Technical Report |
Lemon Agent: A multi-agent collaborative system built on the AgentCortex framework that improves efficiency on complex tasks. |
multimodal |
|
|
| 23 |
HyPER: Bridging Exploration and Exploitation for Scalable LLM Reasoning with Hypothesis Path Expansion and Reduction |
HyPER: Bridges exploration and exploitation through hypothesis path expansion and reduction for scalable LLM reasoning |
chain-of-thought |
|
|
| 24 |
Evaluating Retrieval-Augmented Generation Variants for Natural Language-Based SQL and API Call Generation |
Evaluates retrieval-augmented generation variants for natural-language-to-SQL and API call generation |
large language model |
|
|
| 25 |
BEAGLE: Behavior-Enforced Agent for Grounded Learner Emulation |
BEAGLE: A behavior-enforced agent for grounded simulation of learners' learning processes |
large language model |
|
|
| 26 |
Rethinking Scientific Modeling: Toward Physically Consistent and Simulation-Executable Programmatic Generation |
Proposes a physically consistent programmatic generation framework for automatically creating executable structural modeling code. |
large language model |
✅ |
|
| 27 |
Intrinsic Stability Limits of Autoregressive Reasoning: Structural Consequences for Long-Horizon Execution |
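Reveals the intrinsic stability limits of autoregressive reasoning and proposes a structural governance scheme for long-horizon execution |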
large language model |
|
|