| 1 |
Pruning Long Chain-of-Thought of Large Reasoning Models via Small-Scale Preference Optimization |
Proposes length-controlled preference optimization to address the efficiency problem of large reasoning models |
chain-of-thought |
|
|
| 2 |
Mathematical Computation and Reasoning Errors by Large Language Models |
Evaluates errors made by large language models in mathematical computation to improve educational outcomes |
large language model |
|
|
| 3 |
Exploring the Potential of Large Language Models in Fine-Grained Review Comment Classification |
Uses large language models to improve the accuracy of code review comment classification |
large language model |
|
|
| 4 |
An Automated Multi-modal Evaluation Framework for Mobile Intelligent Assistants Based on Large Language Models and Multi-Agent Collaboration |
Proposes an automated multi-modal evaluation framework to address the problem of evaluating intelligent assistants |
large language model |
|
|
| 5 |
Using Artificial Intuition in Distinct, Minimalist Classification of Scientific Abstracts for Management of Technology Portfolios |
Proposes an artificial intuition method for efficient classification of scientific abstracts |
large language model |
|
|
| 6 |
KompeteAI: Accelerated Autonomous Multi-Agent System for End-to-End Pipeline Generation for Machine Learning Problems |
Proposes KompeteAI to address execution bottlenecks and insufficient exploration in AutoML systems |
large language model |
|
|
| 7 |
Agentic AI Frameworks: Architectures, Protocols, and Design Challenges |
Systematically evaluates Agentic AI frameworks to address communication problems among intelligent agents |
large language model |
|
|
| 8 |
Amazon Nova AI Challenge -- Trusted AI: Advancing secure, AI-assisted software development |
Advances secure AI-assisted software development through the Amazon Nova AI Challenge |
large language model |
|
|
| 9 |
Profile-Aware Maneuvering: A Dynamic Multi-Agent System for Robust GAIA Problem Solving by AWorld |
Proposes a dynamic multi-agent system to improve the robustness of GAIA problem solving |
large language model |
|
|
| 10 |
The PacifAIst Benchmark: Would an Artificial Intelligence Choose to Sacrifice Itself for Human Safety? |
Proposes the PacifAIst benchmark to address the evaluation of self-prioritizing behavior in AI |
large language model |
|
|
| 11 |
UDA: Unsupervised Debiasing Alignment for Pair-wise LLM-as-a-Judge |
Proposes the UDA framework to address bias in large language model evaluation |
large language model |
|
|
| 12 |
On Negative-aware Preference Optimization for Recommendation |
Proposes a negative-aware preference optimization method to improve recommender system performance |
large language model |
|
|
| 13 |
AmbiGraph-Eval: Can LLMs Effectively Handle Ambiguous Graph Queries? |
Proposes AmbiGraph-Eval to evaluate the ability of LLMs to handle ambiguous graph queries |
large language model |
|
|
| 14 |
Your Coding Intent is Secretly in the Context and You Should Deliberately Infer It Before Completion |
Proposes a three-stage reasoning framework to improve code completion accuracy |
large language model |
|
|
| 15 |
Hallucination vs interpretation: rethinking accuracy and precision in AI-assisted data extraction for knowledge synthesis |
Proposes an AI-assisted data extraction method to improve the accuracy and efficiency of knowledge synthesis |
large language model |
|
|
| 16 |
Shadow in the Cache: Unveiling and Mitigating Privacy Risks of KV-cache in LLM Inference |
Proposes KV-Cloak to address the privacy risks of the KV-cache in LLM inference |
large language model |
|
|