| 1 |
Evaluating Large Language Models on Multimodal Chemistry Olympiad Exams |
Evaluates the performance of large language models on multimodal Chemistry Olympiad exam questions |
large language model multimodal visual grounding |
|
|
| 2 |
SGM: Safety Glasses for Multimodal Large Language Models via Neuron-Level Detoxification |
SGM: provides safety protection for multimodal large language models through neuron-level detoxification |
large language model multimodal |
|
|
| 3 |
MCP-SafetyBench: A Benchmark for Safety Evaluation of Large Language Models with Real-World MCP Servers |
Proposes MCP-SafetyBench for evaluating the safety of large language models in real-world MCP server environments |
large language model |
|
|
| 4 |
Dual-Density Inference for Efficient Language Model Reasoning |
Proposes Denser, a dual-density inference framework that improves the computational efficiency of LLMs on complex reasoning tasks. |
large language model chain-of-thought |
|
|
| 5 |
BRAID: Bounded Reasoning for Autonomous Inference and Decisions |
BRAID: a bounded reasoning framework for autonomous inference and decision-making that improves LLM efficiency and accuracy |
large language model |
|
|
| 6 |
Activation Oracles: Training and Evaluating LLMs as General-Purpose Activation Explainers |
Proposes Activation Oracles, training LLMs to explain activations so that they serve as general-purpose activation explainers. |
large language model |
|
|
| 7 |
How Much is Too Much? Exploring LoRA Rank Trade-offs for Retaining Knowledge and Domain Robustness |
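Studies how LoRA rank trades off knowledge retention against domain generalization, providing parameter-efficient fine-tuning strategies for downstream tasks. |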
large language model |
|
|
| 8 |
Evaluating Metrics for Safety with LLM-as-Judges |
Proposes a weighted-metric evaluation method based on LLM-as-Judges to improve the reliability of LLMs in safety-critical tasks. |
large language model |
|
|
| 9 |
CTkvr: KV Cache Retrieval for Long-Context LLMs via Centroid then Token Indexing |
Proposes CTkvr, a KV cache retrieval method for long-context LLMs based on centroid and token indexing |
large language model |
|
|
| 10 |
Toward expert-level motivational interviewing for health behavior improvement with LLMs |
Uses large language models to achieve expert-level motivational interviewing and promote health behavior improvement |
large language model |
|
|
| 11 |
Evaluating LLMs for Zeolite Synthesis Event Extraction (ZSEE): A Systematic Analysis of Prompting Strategies |
Systematically evaluates the effectiveness of prompting strategies for LLMs in zeolite synthesis event extraction (ZSEE) |
large language model |
|
|
| 12 |
Towards Proactive Personalization through Profile Customization for Individual Users in Dialogues |
Proposes PersonalAgent, which achieves proactive personalization in dialogue systems through user profile customization |
large language model |
|
|
| 13 |
The Moralization Corpus: Frame-Based Annotation and Analysis of Moralizing Speech Acts across Diverse Text Genres |
Proposes the Moralization Corpus for analyzing moralizing speech across diverse texts |
large language model |
|
|
| 14 |
Yes-MT's Submission to the Low-Resource Indic Language Translation Shared Task in WMT 2024 |
The Yes-MT team explores multiple approaches to the WMT 2024 low-resource Indic language translation task. |
large language model |
|
|
| 15 |
RFKG-CoT: Relation-Driven Adaptive Hop-count Selection and Few-Shot Path Guidance for Knowledge-Aware QA |
Proposes RFKG-CoT to address hallucination in knowledge-intensive question answering |
large language model |
|
|
| 16 |
The Meta-Prompting Protocol: Orchestrating LLMs via Adversarial Feedback Loops |
Proposes the Meta-Prompting Protocol, which optimizes LLMs through adversarial feedback loops to improve reliability. |
large language model |
|
|