| 1 |
From 128K to 4M: Efficient Training of Ultra-Long Context Large Language Models |
Proposes an efficient training method that extends LLM context length to 4M tokens while preserving performance. |
large language model multimodal instruction following |
✅ |
|
| 2 |
Can LLMs Simulate Personas with Reversed Performance? A Benchmark for Counterfactual Instruction Following |
Proposes the Counterfactual Instruction Following benchmark to evaluate how well LLMs simulate personas with reversed performance. |
large language model instruction following |
|
|
| 3 |
Separator Injection Attack: Uncovering Dialogue Biases in Large Language Models Caused by Role Separators |
Proposes the Separator Injection Attack (SIA), revealing dialogue biases in large language models caused by role separators. |
large language model instruction following |
|
|
| 4 |
BiasCause: Evaluate Socially Biased Causal Reasoning of Large Language Models |
BiasCause: evaluates socially biased causal reasoning in large language models. |
large language model |
|
|
| 5 |
Exposure to Content Written by Large Language Models Can Reduce Stigma Around Opioid Use Disorder in Online Communities |
Uses large language models to reduce stigma around opioid use disorder in online communities. |
large language model |
|
|
| 6 |
Assessing how hyperparameters impact Large Language Models' sarcasm detection performance |
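Studies how hyperparameters affect large language models' sarcasm detection performance; fine-tuned Llama-2-13b reaches human-level performance. |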
large language model |
|
|
| 7 |
Llama-3-Nanda-10B-Chat: An Open Generative Large Language Model for Hindi |
Nanda: a 10B-parameter open generative large language model for Hindi with leading performance. |
large language model |
|
|
| 8 |
Rank-Then-Score: Enhancing Large Language Models for Automated Essay Scoring |
Proposes the Rank-Then-Score framework to improve large language model performance on automated essay scoring. |
large language model |
|
|
| 9 |
Don't Let It Hallucinate: Premise Verification via Retrieval-Augmented Logical Reasoning |
Proposes a retrieval-augmented logical reasoning framework to address false-premise hallucinations in large language models. |
large language model |
|
|
| 10 |
Language-Dependent Political Bias in AI: A Study of ChatGPT and Gemini |
Shows that ChatGPT and Gemini exhibit political bias that varies across languages. |
large language model |
|
|
| 11 |
S'MoRE: Structural Mixture of Residual Experts for Parameter-Efficient LLM Fine-tuning |
Proposes S'MoRE to balance parameter efficiency and model capacity in large language model fine-tuning. |
large language model |
✅ |
|
| 12 |
Query Understanding in LLM-based Conversational Information Seeking |
Explores techniques for improving query understanding with LLMs in conversational information seeking. |
large language model |
|
|
| 13 |
Encoder-Decoder Gemma: Improving the Quality-Efficiency Trade-Off via Adaptation |
Proposes Encoder-Decoder Gemma, improving the quality-efficiency trade-off through model adaptation. |
large language model |
|
|
| 14 |
Navigating the Rabbit Hole: Emergent Biases in LLM-Generated Attack Narratives Targeting Mental Health Groups |
Reveals emergent biases in LLM-generated attack narratives targeting mental health groups. |
large language model |
|
|
| 15 |
QGen Studio: An Adaptive Question-Answer Generation, Training and Evaluation Platform |
QGen Studio: an adaptive platform for question-answer generation, training, and evaluation. |
large language model |
|
|
| 16 |
NativQA Framework: Enabling LLMs with Native, Local, and Everyday Knowledge |
NativQA framework: empowers large language models with localized knowledge. |
large language model |
✅ |
|
| 17 |
Unsupervised Location Mapping for Narrative Corpora |
Proposes an unsupervised location mapping method for locating story trajectories in narrative corpora. |
large language model |
|
|
| 18 |
Enhancing Coreference Resolution with Pretrained Language Models: Bridging the Gap Between Syntax and Semantics |
Uses pretrained language models to fuse syntactic and semantic information, improving coreference resolution. |
large language model |
|
|
| 19 |
Leveraging Robust Optimization for LLM Alignment under Distribution Shifts |
Proposes a robust-optimization-based LLM alignment framework that improves performance under distribution shifts. |
large language model |
|
|
| 20 |
It's the same but not the same: Do LLMs distinguish Spanish varieties? |
Evaluates the ability of large language models to distinguish Spanish varieties, finding that GPT-4o performs best. |
large language model |
|
|
| 21 |
SEA-LION: Southeast Asian Languages in One Network |
Proposes SEA-LION, an advanced multilingual LLM for Southeast Asian languages. |
large language model |
|
|
| 22 |
LLM$\times$MapReduce-V2: Entropy-Driven Convolutional Test-Time Scaling for Generating Long-Form Articles from Extremely Long Resources |
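Proposes LLM×MapReduce-V2, which uses convolutional scaling to strengthen LLMs' ability to generate long-form articles from extremely long inputs. |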
large language model |
✅ |
|
| 23 |
STRIVE: A Think & Improve Approach with Iterative Refinement for Enhancing Question Quality Estimation |
Proposes STRIVE, which uses iterative refinement to improve large language model performance on question quality estimation. |
large language model |
|
|
| 24 |
Towards Smarter Hiring: Are Zero-Shot and Few-Shot Pre-trained LLMs Ready for HR Spoken Interview Transcript Analysis? |
Evaluates zero-shot and few-shot LLMs on HR interview transcript analysis and reveals their limitations. |
large language model |
|
|