| 1 | Towards Foundation Models for Relational Databases with Language Models and Graph Neural Networks | Proposes a foundation model for relational databases that combines language models and graph neural networks | foundation model | | |
| 2 | SaaS-Bench: Can Computer-Use Agents Leverage Real-World SaaS to Solve Professional Workflows? | SaaS-Bench: evaluates whether computer-use agents can solve professional workflows in real-world SaaS environments | large language model, multimodal | ✅ | |
| 3 | DRS-GUI: Dynamic Region Search for Training-Free GUI Grounding | DRS-GUI: training-free dynamic region search that improves GUI element grounding accuracy | large language model, multimodal | | |
| 4 | See Before You Code: Learning Visual Priors for Spatially Aware Educational Animation Generation | OmniManim: a spatially aware educational animation generation framework based on visual priors | large language model | | |
| 5 | Context, Reasoning, and Hierarchy: A Cost-Performance Study of Compound LLM Agent Design in an Adversarial POMDP | Studies the cost-performance trade-offs of compound LLM agent design in an adversarial POMDP and proposes optimization strategies | chain-of-thought | | |
| 6 | PrismQuant: Rate-Distortion-Optimal Vector Quantization for Gaussian-Mixture Sources | PrismQuant: a rate-distortion-optimal vector quantization method for Gaussian-mixture sources | multimodal | | |
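| 7 | Prospective multi-pathogen disease forecasting using autonomous LLM-guided tree search | Proposes an autonomous multi-pathogen disease forecasting system based on LLM-guided tree search, overcoming the manual modeling bottleneck | large language model | | |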
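| 8 | Reasoners or Translators? Contamination-aware Evaluation and Neuro-Symbolic Robustness in Tax Law | Proposes a contamination-aware evaluation method and shows that a neuro-symbolic framework is more robust and generalizes better in tax-law reasoning | large language model | | |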
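| 9 | Toward Natural and Companionable Virtual Agents via Cross-Temporal Emotional Modeling | Proposes CTEM, a cross-temporal emotional modeling framework that improves the naturalness and coherence of companion-style virtual agents | foundation model | | |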
| 10 | Can We Trust AI-Inferred User States. A Psychometric Framework for Validating the Reliability of Users States Classification by LLMs in Operational Environments | Proposes an evaluation framework that validates the reliability of LLM-inferred user states, improving the trustworthiness of AI design in adaptive systems | large language model | | |
| 11 | Position: Early-Stage Quality Assurance in Annotation Pipelines Is More Cost-Effective Than Late-Stage Validation | Argues that early-stage quality assurance in annotation pipelines is more cost-effective than late-stage validation | foundation model | | |
| 12 | ColPackAgent: Agent-Skill-Guided Hard-Particle Monte Carlo Workflows for Colloidal Packing | Proposes ColPackAgent, which runs colloidal packing simulations through agent-skill-guided hard-particle Monte Carlo workflows | large language model | | |
| 13 | A Few GPUs, A Whole Lotta Scale: Faithful LLM Training Emulation with PrismLLM | PrismLLM: faithful emulation of large-scale LLM training using only a small number of GPUs | large language model | | |
| 14 | Detecting Privilege Escalation in Polyglot Microservices via Agentic Program Analysis | Neo: detects privilege escalation vulnerabilities in polyglot microservices using agentic program analysis | large language model | | |
| 15 | RTL-BenchMT: Dynamic Maintenance of RTL Generation Benchmark Through Agent-Assisted Analysis and Revision | Proposes the RTL-BenchMT framework, which uses agent assistance to dynamically maintain RTL generation benchmarks | large language model | | |
| 16 | CAPS: Cascaded Adaptive Pairwise Selection for Efficient Parallel Reasoning | Proposes CAPS, a cascaded adaptive pairwise selection method for efficient parallel reasoning | large language model | | |