| 1 |
The Future of MLLM Prompting is Adaptive: A Comprehensive Experimental Evaluation of Prompt Engineering Methods for Robust Multimodal Performance |
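Adaptive prompt-engineering optimization for multimodal large language models: a comprehensive experimental evaluation aimed at improving model robustness |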
large language model multimodal chain-of-thought |
|
|
| 2 |
See or Recall: A Sanity Check for the Role of Vision in Solving Visualization Question Answer Tasks with Multimodal LLMs |
Proposes a sanity-check framework for VisQA datasets that separates multimodal LLMs' visual reasoning from knowledge recall |
large language model multimodal |
|
|
| 3 |
MatterTune: An Integrated, User-Friendly Platform for Fine-Tuning Atomistic Foundation Models to Accelerate Materials Simulation and Discovery |
MatterTune: an integrated platform for fine-tuning atomistic pre-trained models to accelerate materials simulation and discovery |
foundation model |
|
|
| 4 |
Building Trustworthy Multimodal AI: A Review of Fairness, Transparency, and Ethics in Vision-Language Tasks |
Review: building trustworthy multimodal AI, focusing on fairness, transparency, and ethics in vision-language tasks |
multimodal |
|
|
| 5 |
Optimizing Data Distribution and Kernel Performance for Efficient Training of Chemistry Foundation Models: A Case Study with MACE |
Optimizes data distribution and kernel performance to accelerate training of the chemistry foundation model MACE. |
foundation model |
|
|
| 6 |
Can Competition Enhance the Proficiency of Agents Powered by Large Language Models in the Realm of News-driven Time Series Forecasting? |
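Proposes a competition-based multi-agent framework for news-driven time series forecasting that improves LLMs' capacity for innovation in financial forecasting. |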
large language model |
|
|
| 7 |
Investigating cybersecurity incidents using large language models in latest-generation wireless networks |
Uses large language models to detect cybersecurity incidents in latest-generation wireless networks |
large language model |
|
|
| 8 |
Working with Large Language Models to Enhance Messaging Effectiveness for Vaccine Confidence |
Uses large language models to improve the effectiveness of vaccine-confidence messaging |
large language model |
|
|
| 9 |
A Survey of Large Language Model-Powered Spatial Intelligence Across Scales: Advances in Embodied Agents, Smart Cities, and Earth Science |
Survey: advances in large language model-driven spatial intelligence across scales |
large language model |
|
|
| 10 |
StruPhantom: Evolutionary Injection Attacks on Black-Box Tabular Agents Powered by Large Language Models |
StruPhantom: evolutionary injection attacks against black-box tabular agents powered by large language models |
large language model |
|
|
| 11 |
Zero-shot Autonomous Microscopy for Scalable and Intelligent Characterization of 2D Materials |
ATOMIC: zero-shot autonomous microscopy characterization of 2D materials using foundation models |
large language model foundation model |
|
|
| 12 |
RealWebAssist: A Benchmark for Long-Horizon Web Assistance with Real-World Users |
RealWebAssist: a benchmark for long-horizon web assistance with real users |
instruction following |
|
|
| 13 |
Hierarchical Knowledge Graphs for Story Understanding in Visual Narratives |
Proposes a hierarchical knowledge graph framework for structured semantic understanding of stories in visual narratives |
multimodal |
|
|
| 14 |
Breaking the Data Barrier -- Building GUI Agents Through Task Generalization |
Builds GUI agents through task generalization to break the data barrier |
multimodal |
✅ |
|
| 15 |
Can LLMs Assist Expert Elicitation for Probabilistic Causal Modeling? |
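Uses LLMs to assist expert knowledge elicitation for probabilistic causal modeling, improving transparency in biometric and healthcare decision-making. |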
large language model |
|
|
| 16 |
SymRTLO: Enhancing RTL Code Optimization with LLMs and Neuron-Inspired Symbolic Reasoning |
SymRTLO: enhances RTL code optimization with LLMs and neuro-symbolic reasoning |
large language model |
|
|
| 17 |
Understanding and Optimizing Multi-Stage AI Inference Pipelines |
HERMES: a heterogeneous execution simulator for understanding and optimizing multi-stage AI inference pipelines |
large language model |
|
|
| 18 |
AlayaDB: The Data Foundation for Efficient and Effective Long-context LLM Inference |
AlayaDB: a data foundation for efficient long-context LLM inference |
large language model |
|
|
| 19 |
The Code Barrier: What LLMs Actually Understand? |
Evaluates LLM code understanding through code obfuscation, revealing unexpected robustness in general-purpose models |
foundation model |
|
|
| 20 |
LLM-Driven NPCs: Cross-Platform Dialogue System for Games and Social Platforms |
Proposes an LLM-based cross-platform NPC dialogue system that enables interaction across games and social platforms |
large language model |
|
|
| 21 |
Privacy Meets Explainability: Managing Confidential Data and Transparency Policies in LLM-Empowered Science |
DataShield: a framework for data-leak detection and privacy-policy management in LLM-empowered scientific research |
large language model |
|
|
| 22 |
Automated Validation of COBOL to Java Transformation |
Proposes a symbolic-execution-based automated validation framework for verifying the correctness of COBOL-to-Java code transformation. |
large language model |
|
|
| 23 |
Automated Testing of COBOL to Java Transformation |
Proposes an automated testing framework for COBOL-to-Java transformation, addressing the difficulty of validating LLM-transformed code |
large language model |
|
|
| 24 |
EthosGPT: Mapping Human Value Diversity to Advance Sustainable Development Goals (SDGs) |
EthosGPT: advances the Sustainable Development Goals by mapping the diversity of human values |
large language model |
|
|
| 25 |
PestMA: LLM-based Multi-Agent System for Informed Pest Management |
PestMA: an LLM-based multi-agent system for intelligent pest management |
large language model |
|
|
| 26 |
Two Heads are Better Than One: Test-time Scaling of Multi-agent Collaborative Reasoning |
Proposes an adaptive multi-agent framework that strengthens collaborative reasoning through model training and system coordination |
large language model |
✅ |
|