| 1 |
FMDLlama: Financial Misinformation Detection based on Large Language Models |
FMDLlama: a large language model for financial misinformation detection, fine-tuned from Llama3.1 |
large language model instruction following |
✅ |
|
| 2 |
HelloBench: Evaluating Long Text Generation Capabilities of Large Language Models |
HelloBench: a comprehensive benchmark for evaluating the long-text generation capabilities of large language models |
large language model |
✅ |
|
| 3 |
Enhancing Text-to-SQL Capabilities of Large Language Models via Domain Database Knowledge Injection |
Proposes a domain database knowledge injection method that improves large language models' Text-to-SQL capabilities. |
large language model |
|
|
| 4 |
Exploring the traditional NMT model and Large Language Model for chat translation |
Proposes a model based on MBR self-training to improve chat translation performance |
large language model |
|
|
| 5 |
CHBench: A Chinese Dataset for Evaluating Health in Large Language Models |
Proposes CHBench, the first comprehensive benchmark for evaluating the health safety of Chinese large language models |
large language model |
✅ |
|
| 6 |
XTRUST: On the Multilingual Trustworthiness of Large Language Models |
XTRUST: the first multilingual trustworthiness benchmark for large language models |
large language model |
✅ |
|
| 7 |
Bridging Speech and Text: Enhancing ASR with Pinyin-to-Character Pre-training in LLMs |
Proposes a large language model with Pinyin-to-character pre-training to improve speech recognition performance |
large language model multimodal |
|
|
| 8 |
Strategies for Improving NL-to-FOL Translation with LLMs: Data Generation, Incremental Fine-Tuning, and Verification |
Proposes data generation, incremental fine-tuning, and verification strategies to improve LLM performance on NL-to-FOL translation |
large language model |
|
|
| 9 |
A Comprehensive Survey of Bias in LLMs: Current Landscape and Future Directions |
A comprehensive survey of bias in LLMs: current landscape and future directions |
large language model |
|
|
| 10 |
Controlling Risk of Retrieval-augmented Generation: A Counterfactual Prompting Framework |
Proposes a counterfactual prompting framework for RAG risk control that improves the model's confidence assessment |
large language model |
✅ |
|
| 11 |
AI Can Be Cognitively Biased: An Exploratory Study on Threshold Priming in LLM-Based Batch Relevance Assessment |
Shows that LLMs exhibit a threshold priming cognitive bias in batch relevance assessment |
large language model |
|
|
| 12 |
Finetuning LLMs for Comparative Assessment Tasks |
Fine-tunes LLMs for comparative assessment tasks, improving efficiency and performance |
large language model |
|
|
| 13 |
SLIMER-IT: Zero-Shot NER on Italian Language |
Proposes SLIMER-IT, a zero-shot named entity recognition method for Italian. |
large language model |
|
|
| 14 |
HLB: Benchmarking LLMs' Humanlikeness in Language Use |
HLB: a comprehensive benchmark of LLMs' humanlikeness in language use |
large language model |
✅ |
|
| 15 |
NER-Luxury: Named entity recognition for the fashion and luxury domain |
Proposes the NER-Luxury model to address named entity recognition in the fashion and luxury domain |
large language model |
|
|
| 16 |
Small Language Models: Survey, Measurements, and Insights |
Comprehensively evaluates and analyzes small language models, offering insights into the future of on-device intelligence |
large language model |
|
|
| 17 |
Making Text Embedders Few-Shot Learners |
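Proposes the bge-en-icl model, which uses the in-context learning (ICL) ability of LLMs to improve text embedding quality and reach SOTA performance. |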
large language model |
✅ |
|
| 18 |
A Survey of Stance Detection on Social Media: New Directions and Perspectives |
A survey of stance detection on social media: new directions and future perspectives |
large language model |
|
|
| 19 |
FLEX: Expert-level False-Less EXecution Metric for Reliable Text-to-SQL Benchmark |
FLEX: an expert-level Text-to-SQL evaluation metric free of false judgments, improving benchmark reliability |
large language model |
|
|
| 20 |
Qualitative Insights Tool (QualIT): LLM Enhanced Topic Modeling |
QualIT: uses LLMs to enhance topic modeling, improving topic coherence and diversity |
large language model |
|
|