| 1 |
Improving the Efficiency of LLM Agent Systems through Trajectory Reduction |
AgentDiet: improves the efficiency of LLM agent systems through trajectory reduction, lowering computational cost. |
large language model |
|
|
| 2 |
Uncovering Vulnerabilities of LLM-Assisted Cyber Threat Intelligence |
Uncovers vulnerabilities of LLM-assisted cyber threat intelligence and proposes targeted analysis methods |
large language model |
|
|
| 3 |
Benchmarking LLM-Assisted Blue Teaming via Standardized Threat Hunting |
CyberTeam: a standardized threat-hunting benchmark that evaluates how well LLMs assist blue-team operations |
large language model |
|
|
| 4 |
Clean First, Align Later: Benchmarking Preference Data Cleaning for Reliable LLM Alignment |
PrefCleanBench: the first benchmark for preference data cleaning in LLM alignment, improving reward model quality |
large language model |
✅ |
|
| 5 |
Beyond the Strongest LLM: Multi-Turn Multi-Agent Orchestration vs. Single LLMs on Benchmarks |
Multi-agent orchestration surpasses the strongest LLM: multi-turn interaction outperforms single large models on benchmarks |
large language model |
|
|