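| 1 | Dynamic Adaptation of LoRA Fine-Tuning for Efficient and Task-Specific Optimization of Large Language Models | Proposes dynamic LoRA, which efficiently optimizes large language models for specific tasks through dynamic weight allocation and input-feature adaptation. | large language model multimodal | | |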
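| 2 | Mitigating GenAI-powered Evidence Pollution for Out-of-Context Multimodal Misinformation Detection | Proposes cross-modal evidence reranking and reasoning to mitigate the impact of GenAI-polluted evidence on out-of-context multimodal misinformation detection. | multimodal | | |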
| 3 | Self-reflecting Large Language Models: A Hegelian Dialectical Approach | Proposes a self-reflecting LLM based on Hegelian dialectics to improve scientific idea generation and reasoning. | large language model | | |
| 4 | FlexiGPT: Pruning and Extending Large Language Models with Low-Rank Weight Sharing | FlexiGPT: pruning and extending large language models through low-rank weight sharing. | large language model | | |
| 5 | JustLogic: A Comprehensive Benchmark for Evaluating Deductive Reasoning in Large Language Models | JustLogic: a comprehensive benchmark for evaluating the deductive reasoning ability of large language models. | large language model | ✅ | |
| 6 | CASE-Bench: Context-Aware SafEty Benchmark for Large Language Models | Proposes CASE-Bench, a context-aware safety benchmark framework for LLMs. | large language model | | |
| 7 | Examining Alignment of Large Language Models through Representative Heuristics: The Case of Political Stereotypes | Examines the alignment of large language models through representative heuristics, using political stereotypes as a case study. | large language model | | |
| 8 | Multi-agent KTO: Reinforcing Strategic Interactions of Large Language Model in Language Game | Proposes multi-agent KTO, which reinforces the strategic interaction abilities of large language models through language games. | large language model | | |
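| 9 | Investigating the (De)Composition Capabilities of Large Language Models in Natural-to-Formal Language Conversion | Proposes the DEDC framework for evaluating the decomposition and composition abilities of large language models in natural-to-formal language conversion. | large language model | | |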
| 10 | Evaluating and Improving Graph to Text Generation with Large Language Models | Evaluates and improves the performance of large language models on graph-to-text generation, and introduces the PlanGTG dataset. | large language model | ✅ | |
| 11 | RealCritic: Towards Effectiveness-Driven Evaluation of Language Model Critiques | Proposes RealCritic to evaluate the critique ability of language models. | large language model chain-of-thought | ✅ | |
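| 12 | Leveraging Online Olympiad-Level Math Problems for LLMs Training and Contamination-Resistant Evaluation | Proposes the AoPS-Instruct dataset and LiveAoPSBench to improve LLM performance on Olympiad-level math problems and to support contamination-resistant evaluation. | large language model TAMP | ✅ | |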
| 13 | Tuning LLM Judge Design Decisions for 1/1000 of the Cost | Achieves cost-effective LLM judge design through low-cost hyperparameter tuning. | large language model | ✅ | |
| 14 | ExPerT: Effective and Explainable Evaluation of Personalized Long-Form Text Generation | ExPerT: an effective and explainable evaluation framework for personalized long-form text generation. | large language model | | |
| 15 | Context-Aware Neural Gradient Mapping for Fine-Grained Instruction Processing | Proposes a context-aware neural gradient mapping framework to improve LLM generalization in fine-grained instruction processing. | large language model | | |
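| 16 | Do LLMs Provide Consistent Answers to Health-Related Questions across Languages? | Evaluates the consistency of large language model answers to health questions across multiple languages, revealing potential risks of inconsistent medical information. | large language model | | |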
| 17 | Rethinking Table Instruction Tuning | Rethinks table instruction tuning: a smaller learning rate and less data are enough to improve table understanding. | large language model | | |
| 18 | Unmasking Conversational Bias in AI Multiagent Systems | Proposes a framework for quantifying conversation-induced bias in AI multi-agent systems. | large language model | | |
| 19 | Understanding and Mitigating Gender Bias in LLMs via Interpretable Neuron Editing | Proposes the CommonWords dataset and an interpretable neuron editing method to mitigate gender bias in LLMs. | large language model | | |