| 1 |
OmniGeo: Towards a Multimodal Large Language Models for Geospatial Artificial Intelligence |
Proposes OmniGeo, a multimodal large language model for geospatial artificial intelligence |
large language model multimodal instruction following |
|
|
| 2 |
Towards Agentic Recommender Systems in the Era of Multimodal Large Language Models |
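Explores agentic recommender systems built on multimodal large language models, aiming to make recommendation more interactive and adaptive |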
large language model multimodal |
|
|
| 3 |
Towards Agentic AI Networking in 6G: A Generative Foundation Model-as-Agent Approach |
Proposes the AgentNet framework, which uses generative foundation models to enable collaboration among autonomous AI agents in 6G networks. |
embodied AI foundation model |
|
|
| 4 |
Large Language Models for Water Distribution Systems Modeling and Decision-Making |
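Proposes a framework based on the LLM-EPANET architecture for water distribution system modeling and decision-making, enabling natural-language interaction. |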
large language model |
|
|
| 5 |
Echoes of Power: Investigating Geopolitical Bias in US and China Large Language Models |
Reveals ideological and cultural biases in how US and Chinese large language models answer geopolitical questions |
large language model |
|
|
| 6 |
Code Evolution Graphs: Understanding Large Language Model Driven Design of Algorithms |
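Proposes code evolution graphs for analyzing LLM-driven algorithm design, revealing the code-generation patterns of LLMs in evolutionary computation. |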
large language model |
|
|
| 7 |
Bridging Technology and Humanities: Evaluating the Impact of Large Language Models on Social Sciences Research with DeepSeek-R1 |
Uses DeepSeek-R1 to evaluate the impact of large language models on social science research |
large language model |
|
|
| 8 |
Using Large Language Models to Categorize Strategic Situations and Decipher Motivations Behind Human Behaviors |
Uses large language models to categorize strategic situations and interpret the motivations behind human behavior |
large language model |
|
|
| 9 |
Survey on Evaluation of LLM-based Agents |
A comprehensive survey of the evaluation of LLM-based agents: benchmarks, frameworks, and future directions |
generalist agent |
|
|
| 10 |
The Emperor's New Clothes in Benchmarking? A Rigorous Examination of Mitigation Strategies for LLM Benchmark Data Contamination |
Proposes a systematic evaluation method to address benchmark data contamination in large language models |
large language model |
✅ |
|
| 11 |
Palatable Conceptions of Disembodied Being |
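Explores whether conceptions of consciousness can apply to AI systems that lack embodiment, and the philosophical challenges involved |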
embodied AI |
|
|
| 12 |
Unify and Triumph: Polyglot, Diverse, and Self-Consistent Generation of Unit Tests with LLMs |
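PolyTest: generates self-consistent unit tests by drawing on multiple programming languages and diverse generations, significantly improving test quality |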
large language model |
|
|
| 13 |
Autonomous AI imitators increase diversity in homogeneous information ecosystems |
Shows that autonomous AI imitators increase diversity in homogeneous information ecosystems |
large language model |
|
|
| 14 |
GAN-enhanced Simulation-driven DNN Testing in Absence of Ground Truth |
Proposes a GAN-enhanced, simulation-driven DNN testing method that addresses the difficulty of testing without ground-truth labels |
large language model |
|
|
| 15 |
Advancing Mobile GUI Agents: A Verifier-Driven Approach to Practical Deployment |
V-Droid: a verifier-based mobile GUI agent that improves task automation performance and efficiency |
large language model |
✅ |
|
| 16 |
DeepPsy-Agent: A Stage-Aware and Deep-Thinking Emotional Support Agent System |
DeepPsy-Agent: an emotional support agent system that combines the three-stage theory from psychology with deep learning |
large language model |
|
|
| 17 |
Entropy-based Exploration Conduction for Multi-step Reasoning |
Proposes Entro-duction, which uses entropy to guide how an LLM adjusts its exploration depth during multi-step reasoning. |
large language model |
|
|
| 18 |
Attention Pruning: Automated Fairness Repair of Language Models via Surrogate Simulated Annealing |
Proposes Attention Pruning, which automatically repairs bias in language models via surrogate simulated annealing. |
large language model |
|
|
| 19 |
ChatGPT and U(X): A Rapid Review on Measuring the User Experience |
A rapid review of methods for evaluating the ChatGPT user experience, addressing the lack of a standardized evaluation framework. |
large language model |
|
|
| 20 |
Detecting LLM-Generated Peer Reviews |
Proposes a framework based on covert watermarking for detecting LLM-generated peer reviews, improving detection reliability. |
large language model |
|
|
| 21 |
AutoRedTeamer: Autonomous Red Teaming with Lifelong Attack Integration |
AutoRedTeamer: an autonomous red-teaming framework for large language models, based on lifelong attack integration |
large language model |
|
|