| 1 |
Can Multimodal Foundation Models Understand Schematic Diagrams? An Empirical Study on Information-Seeking QA over Scientific Papers |
Proposes the MISS-QA benchmark to evaluate how well multimodal models understand schematic diagrams in scientific papers |
foundation model multimodal |
|
|
| 2 |
DeepWriter: A Fact-Grounded Multimodal Writing Assistant Based On Offline Knowledge Base |
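DeepWriter: a fact-grounded multimodal writing assistant built on an offline knowledge base that improves document generation quality in specialized domains. |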
large language model multimodal |
|
|
| 3 |
MultiVox: A Benchmark for Evaluating Voice Assistants for Multimodal Interactions |
MultiVox: a new benchmark for evaluating voice assistants in multimodal interactions |
large language model multimodal |
|
|
| 4 |
HKGAI-V1: Towards Regional Sovereign Large Language Model for Hong Kong |
Proposes HKGAI-V1, a regional sovereign large language model tailored for Hong Kong, with a focus on cultural and legal alignment. |
large language model |
|
|
| 5 |
Enhancing Chain-of-Thought Reasoning with Critical Representation Fine-tuning |
Proposes CRFT, which enhances chain-of-thought reasoning through critical representation fine-tuning |
chain-of-thought |
|
|
| 6 |
Cultural Bias in Large Language Models: Evaluating AI Agents through Moral Questionnaires |
Reveals cultural bias in large language models by using moral questionnaires to evaluate the cultural values of AI agents |
large language model |
|
|
| 7 |
LLMs on Trial: Evaluating Judicial Fairness for Large Language Models |
Builds the JudiFair dataset and an evaluation framework, revealing shortcomings of LLMs in judicial fairness |
large language model |
|
|
| 8 |
MLAR: Multi-layer Large Language Model-based Robotic Process Automation Applicant Tracking |
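Proposes MLAR, built on multi-layer large language models, to improve the efficiency of robotic process automation in applicant tracking for recruitment. |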
large language model |
|
|
| 9 |
Fusing Large Language Models with Temporal Transformers for Time Series Forecasting |
Fuses large language models with temporal Transformers for time series forecasting |
large language model |
|
|
| 10 |
Function Induction and Task Generalization: An Interpretability Study with Off-by-One Addition |
Uses interpretability analysis to reveal the generalization mechanisms of large language models on the off-by-one addition task |
large language model |
|
|
| 11 |
CodeJudgeBench: Benchmarking LLM-as-a-Judge for Coding Tasks |
Proposes CodeJudgeBench to evaluate LLM-as-a-judge performance on code generation, code repair, and test generation tasks. |
large language model |
|
|
| 12 |
Can You Detect the Difference? |
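Systematically compares text generated by diffusion models and autoregressive models, revealing the limitations of existing detectors |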
large language model |
|
|
| 13 |
Retention analysis of edited knowledge after fine-tuning |
Studies how fine-tuning causes edited knowledge to be forgotten and proposes methods to improve knowledge retention |
large language model |
|
|
| 14 |
From Words to Proverbs: Evaluating LLMs Linguistic and Cultural Competence in Saudi Dialects with Absher |
Proposes the Absher benchmark to evaluate the linguistic and cultural competence of LLMs in Saudi dialects |
large language model |
|
|
| 15 |
Enhancing Retrieval Augmented Generation with Hierarchical Text Segmentation Chunking |
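Proposes a RAG enhancement method based on hierarchical text segmentation that improves the semantic coherence and accuracy of retrieved information |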
large language model |
|
|
| 16 |
Using AI to replicate human experimental results: a motion study |
Using AI to replicate human experimental results: a linguistic exploration based on a motion study |
large language model |
|
|
| 17 |
Grammar-Guided Evolutionary Search for Discrete Prompt Optimisation |
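Proposes a grammar-guided evolutionary search method for discrete prompt optimisation that improves the performance of small models on complex tasks. |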
large language model |
|
|
| 18 |
GeLaCo: An Evolutionary Approach to Layer Compression |
Proposes GeLaCo to address the problem of compressing large language models |
large language model |
|
|
| 19 |
Protective Factor-Aware Dynamic Influence Learning for Suicide Risk Prediction on Social Media |
Proposes a protective factor-aware dynamic influence learning method for suicide risk prediction on social media. |
large language model |
|
|