| 1 |
Yuan3.0 Flash: An Open Multimodal Large Language Model for Enterprise Applications |
Yuan3.0 Flash: an open-source multimodal large language model for enterprise applications that uses RAPO to optimize reasoning. |
large language model multimodal |
✅ |
|
| 2 |
Streaming Hallucination Detection in Long Chain-of-Thought Reasoning |
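Proposes a streaming hallucination detection method that identifies and explains hallucinations in real time during long chain-of-thought reasoning. |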
large language model chain-of-thought |
|
|
| 3 |
MMP-A*: Multimodal Perception Enhanced Incremental Heuristic Search on Path Planning |
MMP-A*: multimodal-perception-enhanced incremental heuristic search for path planning. |
large language model multimodal |
|
|
| 4 |
Perish or Flourish? A Holistic Evaluation of Large Language Models for Code Generation in Functional Programming |
FPEval: a holistic evaluation of LLM performance and code style in functional programming code generation. |
large language model |
|
|
| 5 |
Exploring Approaches for Detecting Memorization of Recommender System Data in Large Language Models |
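Explores methods for detecting memorization of recommender system data in large language models and assesses the potential of automated prompt engineering. |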
large language model |
|
|
| 6 |
MindChat: A Privacy-preserving Large Language Model for Mental Health Support |
Proposes MindChat, a privacy-preserving large language model for mental health support. |
large language model |
|
|
| 7 |
LIA: Supervised Fine-Tuning of Large Language Models for Automatic Issue Assignment |
LIA: automatic issue assignment via supervised fine-tuning of large language models. |
large language model |
|
|
| 8 |
Project Ariadne: A Structural Causal Framework for Auditing Faithfulness in LLM Agents |
Project Ariadne: proposes a framework based on structural causal models for auditing reasoning faithfulness in LLM agents. |
large language model chain-of-thought |
|
|
| 9 |
Falcon-H1R: Pushing the Reasoning Frontiers with a Hybrid Model for Efficient Test-Time Scaling |
Falcon-H1R: pushes the boundaries of reasoning performance with a hybrid model and efficient test-time scaling. |
chain-of-thought |
|
|
| 10 |
Placement Semantics for Distributed Deep Learning: A Systematic Framework for Analyzing Parallelism Strategies |
Proposes the Placement Semantics framework for systematically analyzing parallelism strategies in distributed deep learning. |
large language model |
|
|
| 11 |
Theory Trace Card: Theory-Driven Socio-Cognitive Evaluation of LLMs |
Proposes Theory Trace Card for theory-driven evaluation of the socio-cognitive abilities of large language models. |
large language model |
|
|
| 12 |
Jenius Agent: Towards Experience-Driven Accuracy Optimization in Real-World Scenarios |
Jenius Agent: an experience-driven accuracy optimization framework for LLM agents in real-world scenarios. |
large language model |
|
|
| 13 |
COMPASS: A Framework for Evaluating Organization-Specific Policy Alignment in LLMs |
COMPASS: a framework for evaluating organization-specific policy alignment in LLMs. |
large language model |
|
|
| 14 |
Yukthi Opus: A Multi-Chain Hybrid Metaheuristic for Large-Scale NP-Hard Optimization |
Proposes Yukthi Opus, a hybrid metaheuristic for large-scale NP-hard optimization problems, suited to settings with a limited evaluation budget. |
multimodal |
|
|
| 15 |
A New Benchmark for the Appropriate Evaluation of RTL Code Optimization |
RTL-OPT: a new benchmark for evaluating LLM capability in RTL code optimization. |
large language model |
|
|
| 16 |
Query-Document Dense Vectors for LLM Relevance Judgment Bias Analysis |
Proposes a framework based on dense-vector clustering for analyzing bias in LLM relevance judgments. |
large language model |
|
|