| 1 |
NeedleInATable: Exploring Long-Context Capability of Large Language Models towards Long-Structured Tables |
Proposes NeedleInATable for evaluating LLMs' long-context understanding of long structured tables. |
large language model multimodal |
|
|
| 2 |
RAISE: Reinforced Adaptive Instruction Selection For Large Language Models |
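Proposes RAISE, a reinforcement-learning-based adaptive instruction selection framework for optimizing instruction fine-tuning of large language models. |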
large language model |
|
|
| 3 |
A Survey on Personalized and Pluralistic Preference Alignment in Large Language Models |
Survey: techniques for personalized and pluralistic preference alignment in large language models |
large language model |
|
|
| 4 |
Exploring the Impact of Personality Traits on Conversational Recommender Systems: A Simulation with Large Language Models |
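Proposes PerCRS, an LLM-based simulation framework for personalized conversational recommender systems, to explore how personality traits affect recommendation outcomes |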
large language model |
|
|
| 5 |
Socrates or Smartypants: Testing Logic Reasoning Capabilities of Large Language Models with Logic Programming-based Test Oracles |
Proposes SmartyPat-Bench and the SmartyPat framework for evaluating and improving the logical reasoning capabilities of LLMs. |
large language model |
|
|
| 6 |
Lugha-Llama: Adapting Large Language Models for African Languages |
Lugha-Llama: improving African-language processing by adapting large language models |
large language model |
|
|
| 7 |
Estimating Item Difficulty Using Large Language Models and Tree-Based Machine Learning Algorithms |
Uses large language models and tree-based models to predict the difficulty of K-5 math and reading items |
large language model |
|
|
| 8 |
DeduCE: Deductive Consistency as a Framework to Evaluate LLM Reasoning |
DeduCE: proposes a framework based on deductive consistency for evaluating LLM reasoning |
large language model chain-of-thought |
|
|
| 9 |
Integrating Cognitive Processing Signals into Language Models: A Review of Advances, Applications and Future Directions |
Survey: integrating cognitive processing signals to enhance language models and multimodal large language models |
large language model multimodal |
|
|
| 10 |
Kaleidoscope: In-language Exams for Massively Multilingual Vision Evaluation |
Kaleidoscope: an in-language exam benchmark for massively multilingual vision evaluation |
multimodal |
|
|
| 11 |
LayerFlow: Layer-wise Exploration of LLM Embeddings using Uncertainty-aware Interlinked Projections |
LayerFlow: layer-wise exploration of the LLM embedding space using uncertainty-aware interlinked projections |
large language model |
|
|
| 12 |
Alice: Proactive Learning with Teacher's Demonstrations for Weak-to-Strong Generalization |
Proposes Alice, a proactive learning framework that uses teacher-model demonstrations to improve weak-to-strong generalization |
large language model |
|
|
| 13 |
PAYADOR: A Minimalist Approach to Grounding Language Models on Structured Data for Interactive Storytelling and Role-playing Games |
PAYADOR: an approach that grounds language models on structured data for interactive storytelling and role-playing games |
large language model |
|
|
| 14 |
HypoEval: Hypothesis-Guided Evaluation for Natural Language Generation |
HypoEval: a hypothesis-guided evaluation framework for natural language generation |
large language model |
|
|
| 15 |
KG-LLM-Bench: A Scalable Benchmark for Evaluating LLM Reasoning on Textualized Knowledge Graphs |
KG-LLM-Bench: a scalable benchmark for evaluating LLM reasoning on textualized knowledge graphs |
large language model |
|
|
| 16 |
HalluciNot: Hallucination Detection Through Context and Common Knowledge Verification |
Proposes HalluciNot, which detects hallucinations in enterprise large language models through context and common-knowledge verification. |
large language model |
|
|
| 17 |
Evaluating Retrieval Augmented Generative Models for Document Queries in Transportation Safety |
Evaluates retrieval-augmented generation models for transportation-safety document queries, with RAG-LLaMA standing out. |
large language model |
|
|
| 18 |
Towards LLMs Robustness to Changes in Prompt Format Styles |
Proposes the mixture-of-formats (MOF) method to improve LLM robustness to changes in prompt format |
large language model |
|
|
| 19 |
RuOpinionNE-2024: Extraction of Opinion Tuples from Russian News Texts |
RuOpinionNE-2024: proposes an evaluation task for extracting opinion tuples from Russian news texts |
large language model |
|
|
| 20 |
Data Augmentation for Fake Reviews Detection in Multiple Languages and Multiple Domains |
Proposes an LLM-based data augmentation method that improves fake review detection across multiple languages and domains |
large language model |
|
|
| 21 |
CAReDiO: Cultural Alignment of LLM via Representativeness and Distinctiveness Guided Data Optimization |
CAReDiO: cultural alignment of LLMs through data optimization guided by representativeness and distinctiveness |
large language model |
|
|
| 22 |
Open Problems and a Hypothetical Path Forward in LLM Knowledge Paradigms |
Discusses open problems in LLM knowledge paradigms and proposes a future model paradigm based on contextual knowledge expansion |
large language model |
|
|
| 23 |
Learning Optimal Prompt Ensemble for Multi-source Visual Prompt Transfer |
Proposes HGPrompt, which learns optimal prompt ensemble weights to improve multi-source visual prompt transfer |
foundation model |
|
|
| 24 |
More diverse more adaptive: Comprehensive Multi-task Learning for Improved LLM Domain Adaptation in E-commerce |
Proposes a multi-task learning framework for e-commerce that improves LLM domain adaptation |
large language model |
|
|
| 25 |
SEE: Continual Fine-tuning with Sequential Ensemble of Experts |
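Proposes the SEE framework, which performs continual fine-tuning of large language models with a sequential ensemble of experts to mitigate catastrophic forgetting. |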
large language model |
|
|
| 26 |
ThoughtProbe: Classifier-Guided Thought Space Exploration Leveraging LLM Intrinsic Reasoning |
ThoughtProbe: classifier-guided exploration of the thought space, leveraging the intrinsic reasoning ability of LLMs |
large language model |
|
|
| 27 |
Automated Business Process Analysis: An LLM-Based Approach to Value Assessment |
Uses LLMs to automate business process analysis for value assessment |
large language model |
|
|
| 28 |
Exploring the Effectiveness and Interpretability of Texts in LLM-based Time Series Models |
Finds that textual information in LLM-based time series models has limited effectiveness and interpretability |
large language model |
✅ |
|