| 1 |
iCLP: Large Language Model Reasoning with Implicit Cognition Latent Planning |
Proposes the iCLP framework, which uses implicit cognition latent planning to improve LLM reasoning. |
large language model chain-of-thought |
|
|
| 2 |
Beyond Hallucinations: A Composite Score for Measuring Reliability in Open-Source Large Language Models |
Proposes the Composite Reliability Score (CRS) for evaluating the reliability of open-source large language models. |
large language model |
|
|
| 3 |
Automated Analysis of Sustainability Reports: Using Large Language Models for the Extraction and Prediction of EU Taxonomy-Compliant KPIs |
Builds a dataset of EU Taxonomy KPIs and evaluates large language models on sustainability report analysis. |
large language model |
|
|
| 4 |
Fantastic Reasoning Behaviors and Where to Find Them: Unsupervised Discovery of the Reasoning Process |
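Proposes the RISE framework, which discovers behavior vectors in the LLM reasoning process without supervision and enables controllable intervention. |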
large language model chain-of-thought |
|
|
| 5 |
Joint Selection for Large-Scale Pre-Training Data via Policy Gradient-based Mask Learning |
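Proposes DATAMASK, which optimizes the joint selection of large-scale pre-training data via policy gradients, improving training efficiency and model performance. |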
large language model |
|
|
| 6 |
Training a Huggingface Model on AWS Sagemaker (Without Tears) |
Provides a complete guide to training Hugging Face models on AWS SageMaker, lowering the barrier to using the cloud platform. |
large language model |
|
|
| 7 |
Improving Multi-step RAG with Hypergraph-based Memory for Long-Context Complex Relational Modeling |
Proposes HGMem, a hypergraph-based memory that improves multi-step RAG performance in long-context complex relational modeling. |
large language model |
|
|
| 8 |
LimAgents: Multi-Agent LLMs for Generating Research Limitations |
LimAgents: a multi-agent LLM framework for generating more in-depth analyses of research paper limitations. |
large language model |
|
|
| 9 |
Closing the Data Loop: Using OpenDataArena to Engineer Superior Training Datasets |
Uses OpenDataArena to build high-quality training datasets that improve large language model performance. |
large language model |
|
|
| 10 |
Concept Attractors in LLMs and their Applications |
Uses concept attractors in LLMs to address problems such as translation and hallucination without training. |
large language model |
|
|
| 11 |
Efficient Context Scaling with LongCat ZigZag Attention |
Proposes LongCat ZigZag Attention, which speeds up long-context processing and improves long-range reasoning and agent capabilities. |
foundation model |
|
|
| 12 |
Paragraph Segmentation Revisited: Towards a Standard Task for Structuring Speech |
Proposes the TEDPara and YTSegPara benchmarks and designs the MiniSeg model to improve paragraph segmentation of speech transcripts. |
large language model |
|
|
| 13 |
Training Report of TeleChat3-MoE |
TeleChat3-MoE training report: building reliable and efficient training infrastructure for ultra-large-scale MoE models. |
large language model |
|
|
| 14 |
Activation Steering for Masked Diffusion Language Models |
Proposes an activation steering framework for controlling the attributes of text generated by masked diffusion language models. |
large language model |
|
|