| 1 |
Evolving Jailbreaks: Automated Multi-Objective Long-Tail Attacks on Large Language Models |
Proposes the EvoJail framework for automatically discovering long-tail attack strategies |
large language model |
|
|
| 2 |
Embodied Science: Closing the Discovery Loop with Agentic Embodied AI |
Proposes the Embodied Science paradigm, which uses embodied AI to close the loop on scientific discovery problems |
embodied AI |
|
|
| 3 |
AI Agents Can Already Autonomously Perform Experimental High Energy Physics |
AI agents autonomously carry out experimental high energy physics analysis, accelerating the research workflow |
large language model |
|
|
| 4 |
Learning Dynamic Belief Graphs for Theory-of-mind Reasoning |
Proposes a dynamic belief graph model to strengthen LLM theory-of-mind reasoning in complex environments |
large language model |
|
|
| 5 |
Pitfalls in Evaluating Interpretability Agents |
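Proposes an unsupervised intrinsic evaluation method to address the challenges of evaluating automated interpretability systems |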
large language model |
|
|
| 6 |
Agentic Harness for Real-World Compilers |
Proposes llvm-autofix, which helps LLM agents understand and fix LLVM compiler bugs |
large language model |
✅ |
|
| 7 |
Utility-Guided Agent Orchestration for Efficient LLM Tool Use |
Proposes utility-guided agent orchestration to make LLM tool use more efficient |
large language model |
|
|
| 8 |
Stepwise: Neuro-Symbolic Proof Search for Automated Systems Verification |
Proposes Stepwise, a neuro-symbolic framework for proof search in automated systems verification |
large language model |
|
|
| 9 |
Skilled AI Agents for Embedded and IoT Systems Development |
Proposes a skill-based AI agent framework for hardware-in-the-loop embedded and IoT systems development |
large language model |
|
|
| 10 |
Optimal Scalar Quantization for Matrix Multiplication: Closed-Form Density and Phase Transition |
Proposes an optimal scalar quantization method to improve matrix multiplication accuracy |
large language model |
|
|