| 1 |
The Geometry of Thought: How Scale Restructures Reasoning In Large Language Models |
Reveals the geometric structure of LLM reasoning: scaling reshapes how models reason |
large language model chain-of-thought |
|
|
| 2 |
KOCO-BENCH: Can Large Language Models Leverage Domain Knowledge in Software Development? |
KOCO-BENCH: evaluates the ability of large language models in domain-knowledge-driven software development |
large language model |
✅ |
|
| 3 |
Explicit Cognitive Allocation: A Principle for Governed and Auditable Inference in Large Language Models |
Proposes the principle of explicit cognitive allocation to make LLM inference more controllable and traceable |
large language model |
|
|
| 4 |
Integrating Virtual Reality and Large Language Models for Team-Based Non-Technical Skills Training and Evaluation in the Operating Room |
VORTeX: combines VR and LLMs for training and evaluating team non-technical skills in the operating room |
large language model |
|
|
| 5 |
A Lightweight Modular Framework for Constructing Autonomous Agents Driven by Large Language Models: Design, Implementation, and Applications in AgentForge |
AgentForge: a lightweight modular framework for building autonomous agents driven by large language models |
large language model |
|
|
| 6 |
Scientific production in the era of Large Language Models |
Large language models significantly increase scientific paper output, but may lower paper quality and change citation patterns |
large language model |
|
|
| 7 |
CORVUS: Red-Teaming Hallucination Detectors via Internal Signal Camouflage in Large Language Models |
CORVUS: attacks LLM hallucination detectors through internal signal camouflage |
large language model |
|
|
| 8 |
An Evolutionary Framework for Automatic Optimization Benchmark Generation via Large Language Models |
Proposes an evolutionary framework for automatically generating optimization benchmarks |
large language model |
|
|
| 9 |
Vision Language Models for Optimization-Driven Intent Processing in Autonomous Networks |
IntentOpt: evaluates vision language models on optimization-driven intent processing in autonomous networks |
large language model multimodal |
|
|
| 10 |
Tracing the Data Trail: A Survey of Data Provenance, Transparency and Traceability in LLMs |
Surveys data provenance, transparency, and traceability in LLMs, addressing the opacity of the training data lifecycle. |
large language model |
|
|
| 11 |
SCULPT: Constraint-Guided Pruned MCTS that Carves Efficient Paths for Mathematical Reasoning |
SCULPT: constraint-guided pruned MCTS that plans efficient paths for mathematical reasoning |
large language model |
|
|
| 12 |
Real-Time Deadlines Reveal Temporal Awareness Failures in LLM Strategic Dialogues |
Reveals LLMs' failure to perceive real-time deadlines in strategic dialogues |
large language model |
|
|
| 13 |
Prompt Injection Mitigation with Agentic AI, Nested Learning, and AI Sustainability via Semantic Caching |
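Proposes a prompt injection mitigation method based on agentic AI, nested learning, and semantic caching, improving LLM security and sustainability. |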
large language model |
|
|
| 14 |
ArchAgent: Scalable Legacy Software Architecture Recovery with LLMs |
ArchAgent: uses LLMs for scalable architecture recovery of large-scale legacy software |
large language model |
✅ |
|
| 15 |
Beyond Accuracy: Characterizing Code Comprehension Capabilities in (Large) Language Models |
Proposes a diagnostic framework for evaluating the code comprehension capabilities of large language models |
large language model |
|
|
| 16 |
On the Evidentiary Limits of Membership Inference for Copyright Auditing |
Shows that membership inference attacks against LLMs lack sufficient evidentiary strength for copyright auditing |
large language model |
|
|
| 17 |
MirrorGuard: Toward Secure Computer-Use Agents via Simulation-to-Real Reasoning Correction |
MirrorGuard: improves the security of computer-use agents through simulation-to-real reasoning correction |
foundation model |
✅ |
|
| 18 |
Empowering All-in-Loop Health Management of Spacecraft Power System in the Mega-Constellation Era via Human-AI Collaboration |
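Proposes SpaceHMchat, a human-AI collaboration framework that enables all-in-loop health management of spacecraft power systems in the mega-constellation era |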
TAMP |
|
|