| 1 |
From Individuals to Interactions: Benchmarking Gender Bias in Multimodal Large Language Models from the Lens of Social Relationship |
Proposes the Genres benchmark to evaluate gender bias in multimodal large language models |
large language model multimodal |
|
|
| 2 |
Advanced Financial Reasoning at Scale: A Comprehensive Evaluation of Large Language Models on CFA Level III |
Evaluates the financial reasoning ability of large language models on the CFA Level III exam |
large language model chain-of-thought |
|
|
| 3 |
Two Spelling Normalization Approaches Based on Large Language Models |
Proposes two large-language-model-based spelling normalization approaches to address spelling problems in historical documents |
large language model |
|
|
| 4 |
Decoding Memes: Benchmarking Narrative Role Classification across Multilingual and Multimodal Models |
Proposes multilingual multimodal models to address narrative role classification of internet memes |
multimodal |
|
|
| 5 |
Datasets for Fairness in Language Models: An In-Depth Survey |
Proposes a framework for analyzing fairness datasets to address language model evaluation issues |
large language model |
✅ |
|
| 6 |
TuCo: Measuring the Contribution of Fine-Tuning to Individual Responses of LLMs |
Proposes TuCo to quantify the contribution of fine-tuning to individual LLM responses |
large language model |
|
|
| 7 |
Perspective Dial: Measuring Perspective of Text and Guiding LLM Outputs |
Proposes Perspective Dial to quantify the perspective of text and guide LLM outputs |
large language model |
|
|
| 8 |
ATGen: A Framework for Active Text Generation |
Proposes the ATGen framework to address active learning in natural language generation |
large language model |
|
|
| 9 |
Information Loss in LLMs' Multilingual Translation: The Role of Training Data, Language Proximity, and Language Family |
Studies how training data and language characteristics affect information loss in multilingual translation |
large language model |
|
|
| 10 |
V-SYNTHESIS: Task-Agnostic Synthesis of Consistent and Diverse In-Context Demonstrations from Scratch via V-Entropy |
Proposes V-Synthesis to address the synthesis of consistent and diverse demonstrations from scratch |
large language model |
|
|
| 11 |
Learning-to-Context Slope: Evaluating In-Context Learning Effectiveness Beyond Performance Illusions |
Proposes the Learning-to-Context Slope to address the reliability of ICL evaluation |
large language model |
|
|
| 12 |
Format-Adapter: Improving Reasoning Capability of LLMs by Adapting Suitable Format |
Proposes Format-Adapter to address the insufficient reasoning capability of large language models |
large language model |
|
|