| 1 |
NUMCoT: Numerals and Units of Measurement in Chain-of-Thought Reasoning using Large Language Models |
NUMCoT: studies how large language models handle numerals and units of measurement in chain-of-thought reasoning |
large language model chain-of-thought |
|
|
| 2 |
Wings: Learning Multimodal LLMs without Text-only Forgetting |
Wings: a new architecture that addresses text-only forgetting in multimodal LLMs |
large language model multimodal |
|
|
| 3 |
Large Language Models as Evaluators for Recommendation Explanations |
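Explores the feasibility of using large language models as evaluators of recommendation explanations, improving evaluation efficiency and consistency |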
large language model instruction following |
✅ |
|
| 4 |
Docs2KG: Unified Knowledge Graph Construction from Heterogeneous Documents Assisted by Large Language Models |
Proposes the Docs2KG framework, which uses large language models to construct a unified knowledge graph from heterogeneous documents |
large language model multimodal |
|
|
| 5 |
Exploring Multilingual Large Language Models for Enhanced TNM classification of Radiology Report in lung cancer staging |
Uses multilingual large language models to improve TNM classification of radiology reports in lung cancer staging |
large language model |
|
|
| 6 |
IrokoBench: A New Benchmark for African Languages in the Age of Large Language Models |
IrokoBench: an evaluation benchmark for large language models on African languages |
large language model |
|
|
| 7 |
Queue Management for SLO-Oriented Large Language Model Serving |
QLM: a queue management system for LLM serving with relaxed latency requirements that improves resource utilization |
large language model |
|
|
| 8 |
Automating Turkish Educational Quiz Generation Using Large Language Models |
Proposes the Turkish-Quiz-Instruct dataset and uses large language models to automatically generate Turkish educational quizzes |
large language model |
|
|
| 9 |
Assessing the Emergent Symbolic Reasoning Abilities of Llama Large Language Models |
Evaluates the emergent symbolic reasoning abilities of Llama large language models |
large language model |
|
|
| 10 |
FragRel: Exploiting Fragment-level Relations in the External Memory of Large Language Models |
FragRel: exploits fragment-level relations to enhance the external memory of large language models, improving long-text processing |
large language model |
|
|
| 11 |
Bi-Chainer: Automated Large Language Models Reasoning with Bidirectional Chaining |
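Proposes Bi-Chainer, a bidirectional chaining reasoning method that improves LLM reasoning accuracy and efficiency on complex logical problems |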
large language model |
|
|
| 12 |
Unveiling Selection Biases: Exploring Order and Token Sensitivity in Large Language Models |
Reveals selection biases in large language models by exploring order and token sensitivity |
large language model |
|
|
| 13 |
Evaluating the Efficacy of Large Language Models in Detecting Fake News: A Comparative Analysis |
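Comparatively evaluates the effectiveness of large language models at fake news detection, providing a reference for information integrity |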
large language model |
|
|
| 14 |
Improve Mathematical Reasoning in Language Models by Automated Process Supervision |
Proposes the OmegaPRM algorithm for automated process supervision of language models in mathematical reasoning |
large language model chain-of-thought |
|
|
| 15 |
Analyzing LLM Behavior in Dialogue Summarization: Unveiling Circumstantial Hallucination Trends |
Analyzes LLM behavior in dialogue summarization, revealing circumstantial hallucination trends |
large language model |
|
|
| 16 |
PatentEval: Understanding Errors in Patent Generation |
PatentEval: proposes an error typology for patent generation, used to evaluate language models on patent text generation tasks |
large language model |
|
|
| 17 |
TACT: Advancing Complex Aggregative Reasoning with Information Extraction Tools |
TACT: uses information extraction tools to improve complex aggregative reasoning |
large language model |
|
|
| 18 |
Cycles of Thought: Measuring LLM Confidence through Stable Explanations |
Proposes an LLM confidence estimation framework based on explanation stability, improving uncertainty quantification |
large language model |
|
|
| 19 |
StatBot.Swiss: Bilingual Open Data Exploration in Natural Language |
Releases the bilingual StatBot.Swiss dataset to evaluate the generalization ability of LLMs on Text-to-SQL tasks |
large language model |
|
|
| 20 |
Towards Detecting LLMs Hallucination via Markov Chain-based Multi-agent Debate Framework |
Proposes a Markov chain-based multi-agent debate framework for detecting LLM hallucinations |
large language model |
|
|
| 21 |
Exploring Human-AI Perception Alignment in Sensory Experiences: Do LLMs Understand Textile Hand? |
Explores human-AI perception alignment: can large language models understand textile hand? |
large language model |
|
|
| 22 |
BadAgent: Inserting and Activating Backdoor Attacks in LLM Agents |
BadAgent: inserting and activating backdoor attacks in LLM agents |
large language model |
✅ |
|
| 23 |
HYDRA: Model Factorization Framework for Black-Box LLM Personalization |
HYDRA: a model factorization framework for black-box LLM personalization |
large language model |
✅ |
|
| 24 |
LLM as a Scorer: The Impact of Output Order on Dialogue Evaluation |
Studies how output order affects dialogue evaluation when an LLM acts as a scorer |
large language model |
|
|