| 1 | Leveraging Large Language Models to Identify Conversation Threads in Collaborative Learning | Uses large language models to identify conversation threads in collaborative learning, improving conversation analysis performance. | large language model | |
| 2 | MMPersuade: A Dataset and Evaluation Framework for Multimodal Persuasion | MMPersuade: a multimodal persuasion dataset and evaluation framework for assessing persuasion in large vision-language models. | multimodal | |
| 3 | Low-Resource Dialect Adaptation of Large Language Models: A French Dialect Case-Study | Proposes a low-resource dialect adaptation method that uses parameter-efficient fine-tuning to improve LLM performance on Québec French. | large language model | |
| 4 | Frustratingly Easy Task-aware Pruning for Large Language Models | Proposes a task-aware pruning method that compresses large language models while preserving task-specific performance. | large language model | |
| 5 | Once Upon an Input: Reasoning via Per-Instance Program Synthesis | Proposes Per-Instance Program Synthesis (PIPS) to improve LLM performance on complex reasoning tasks. | large language model, chain-of-thought | |
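| 6 | A Comprehensive Dataset for Human vs. AI Generated Text Detection | Builds a large-scale dataset for detecting human- versus AI-generated text, supporting attribution and identification of AI-generated content. | large language model | ✅ |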
| 7 | Interpreting and Mitigating Unwanted Uncertainty in LLMs | Investigates and mitigates unwanted answer uncertainty in large language models. | large language model | |
| 8 | Cross-Lingual Stability and Bias in Instruction-Tuned Language Models for Humanitarian NLP | Evaluates the cross-lingual stability and bias of instruction-tuned language models for humanitarian NLP. | large language model | |
| 9 | EchoMind: An Interrelated Multi-level Benchmark for Evaluating Empathetic Speech Language Models | Proposes EchoMind, an interrelated multi-level benchmark for evaluating empathetic speech language models. | instruction following | |
| 10 | Beyond Semantics: How Temporal Biases Shape Retrieval in Transformer and State-Space Models | Studies how temporal biases in Transformer and state-space models affect retrieval in in-context learning. | large language model | |
| 11 | SALSA: Single-pass Autoregressive LLM Structured Classification | SALSA: a single-pass autoregressive LLM structured classification method that improves text classification performance. | large language model | |
| 12 | Rule-Based Explanations for Retrieval-Augmented LLM Systems | Proposes a rule-based explanation method for retrieval-augmented LLM systems, improving interpretability. | large language model | |
| 13 | AutoBench: Automating LLM Evaluation through Reciprocal Peer Assessment | AutoBench: automatically evaluates large language models through reciprocal peer assessment. | large language model | |
| 14 | Pedagogy-driven Evaluation of Generative AI-powered Intelligent Tutoring Systems | Proposes a pedagogy-driven evaluation framework for generative AI-powered intelligent tutoring systems. | large language model | |
| 15 | SABlock: Semantic-Aware KV Cache Eviction with Adaptive Compression Block Size | SABlock: semantic-aware KV cache eviction with adaptive block size, improving long-context LLM inference efficiency. | large language model | |
| 16 | LooGLE v2: Are LLMs Ready for Real World Long Dependency Challenges? | LooGLE v2: evaluates LLMs on real-world long-dependency tasks and reveals their limitations. | large language model | |
| 17 | Text to Trust: Evaluating Fine-Tuning and LoRA Trade-offs in Language Models for Unfair Terms of Service Detection | Evaluates the trade-offs between fine-tuning and LoRA in language models for detecting unfair terms of service. | large language model | |
| 18 | CHOIR: Collaborative Harmonization fOr Inference Robustness | Proposes CHOIR, which improves reasoning robustness by harmonizing reasoning signals from multiple persona-conditioned LLMs. | large language model | |
|