| 1 |
Hidden in Plain Sight: Reasoning in Underspecified and Misspecified Scenarios for Multimodal LLMs |
Analyzes how well multimodal LLMs reason in underspecified and misspecified scenarios, and proposes improvement strategies. |
large language model multimodal |
|
|
| 2 |
MELT: Towards Automated Multimodal Emotion Data Annotation by Leveraging LLM Embedded Knowledge |
Proposes MELT, which automatically annotates multimodal emotion data by leveraging knowledge embedded in LLMs. |
large language model multimodal |
|
|
| 3 |
Open CaptchaWorld: A Comprehensive Web-based Platform for Testing and Benchmarking Multimodal LLM Agents |
Proposes the Open CaptchaWorld platform for evaluating the reasoning and interaction abilities of multimodal LLM agents on CAPTCHA tasks. |
multimodal |
|
|
| 4 |
Adaptable Cardiovascular Disease Risk Prediction from Heterogeneous Data using Large Language Models |
AdaCVD: adaptive cardiovascular disease risk prediction from heterogeneous data using large language models. |
large language model |
|
|
| 5 |
The World As Large Language Models See It: Exploring the reliability of LLMs in representing geographical features |
Evaluates how well large language models represent geographic information: a reliability analysis of GPT-4o and Gemini 2.0 on geospatial tasks. |
large language model |
|
|
| 6 |
Gated Multimodal Graph Learning for Personalized Recommendation |
Proposes RLMultimodalRec, which achieves personalized recommendation through gated multimodal graph learning. |
multimodal |
|
|
| 7 |
Towards Scalable Schema Mapping using Large Language Models |
Proposes a scalable schema mapping method based on large language models to address challenges in data integration. |
large language model |
|
|
| 8 |
Generative AI for Urban Design: A Stepwise Approach Integrating Human Expertise with Multimodal Diffusion Models |
Proposes a stepwise generative AI framework for urban design, built on multimodal diffusion models that integrate human expertise. |
multimodal |
|
|
| 9 |
FABLE: A Novel Data-Flow Analysis Benchmark on Procedural Text for Large Language Model Evaluation |
Proposes the FABLE benchmark for evaluating the data-flow reasoning ability of large language models. |
large language model |
|
|
| 10 |
Evaluation of LLMs for mathematical problem solving |
Evaluates large language models on mathematical problem solving, revealing the strengths and weaknesses of different models. |
large language model chain-of-thought |
|
|
| 11 |
Random Rule Forest (RRF): Interpretable Ensembles of LLM-Generated Questions for Predicting Startup Success |
Proposes Random Rule Forest (RRF), which uses LLM-generated questions for interpretable prediction of startup success. |
large language model |
|
|
| 12 |
Chances and Challenges of the Model Context Protocol in Digital Forensics and Incident Response |
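Explores the use of the Model Context Protocol in digital forensics and incident response to improve LLM transparency and reproducibility. |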
large language model |
|
|
| 13 |
MIR: Methodology Inspiration Retrieval for Scientific Research Problems |
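Proposes MIR, which uses a methodology adjacency graph (MAG) to improve methodology inspiration retrieval for scientific research problems. |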
large language model |
|
|
| 14 |
Whispers of Many Shores: Cultural Alignment through Collaborative Cultural Expertise |
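Proposes a cultural alignment framework based on soft prompt fine-tuning to improve the cultural sensitivity and adaptability of LLMs. |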
large language model |
|
|
| 15 |
Tournament of Prompts: Evolving LLM Instructions Through Structured Debates and Elo Ratings |
Proposes DEEVO, which optimizes LLM prompts with a debate-driven evolutionary algorithm and requires no predefined metrics. |
large language model |
|
|
| 16 |
A survey of using EHR as real-world evidence for discovering and validating new drug indications |
Surveys research on using electronic health records as real-world evidence for discovering and validating new drug indications. |
large language model |
|
|
| 17 |
Memory OS of AI Agent |
Proposes MemoryOS, which gives AI agents comprehensive and efficient memory management, improving long-term memory and personalized interaction. |
large language model |
✅ |
|
| 18 |
Mixture-of-Experts for Personalized and Semantic-Aware Next Location Prediction |
Proposes NextLocMoE, which uses a two-level MoE structure and LLM enhancement for personalized, semantic-aware location prediction. |
large language model |
|
|
| 19 |
Leveraging Knowledge Graphs and LLMs for Structured Generation of Misinformation |
Uses knowledge graphs and large language models for the structured generation of misinformation. |
large language model |
|
|
| 20 |
Optimizing the Interface Between Knowledge Graphs and LLMs for Complex Reasoning |
Optimizes the interface between knowledge graphs and LLMs to improve performance on complex reasoning. |
large language model |
|
|
| 21 |
LPASS: Linear Probes as Stepping Stones for vulnerability detection using compressed LLMs |
LPASS: uses linear probes to speed up vulnerability detection with compressed LLMs, improving efficiency and performance. |
large language model |
|
|
| 22 |
RMoA: Optimizing Mixture-of-Agents through Diversity Maximization and Residual Compensation |
RMoA: optimizes Mixture-of-Agents systems through diversity maximization and residual compensation. |
large language model |
✅ |
|
| 23 |
GridRoute: A Benchmark for LLM-Based Route Planning with Cardinal Movement in Grid Environments |
GridRoute: a benchmark for LLM-based route planning in grid environments, together with an algorithm-guided prompting method. |
large language model |
✅ |
|
| 24 |
TRAPDOC: Deceiving LLM Users by Injecting Imperceptible Phantom Tokens into Documents |
TRAPDOC: deceives LLM users by injecting imperceptible phantom tokens into documents, with the aim of reducing over-reliance. |
large language model |
✅ |
|
| 25 |
Mind the Quote: Enabling Quotation-Aware Dialogue in LLMs via Plug-and-Play Modules |
Proposes QuAda, which uses plug-and-play modules to strengthen LLMs in quotation-aware dialogue. |
large language model |
|
|
| 26 |
E^2GraphRAG: Streamlining Graph-based RAG for High Efficiency and Effectiveness |
E^2GraphRAG: streamlines graph-based RAG for efficient and effective knowledge retrieval. |
large language model |
|
|
| 27 |
Learning API Functionality from In-Context Demonstrations for Tool-based Agents |
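Proposes a method for learning API functionality from in-context demonstrations, raising the task success rate of tool-based agents when no documentation is available. |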
large language model |
|
|