| 1 |
Memorization in Large Language Models in Medicine: Prevalence, Characteristics, and Implications |
Shows that medical large language models exhibit significant memorization, which may affect their generalization and safety. |
large language model |
|
|
| 2 |
MultimodalHugs: Enabling Sign Language Processing in Hugging Face |
Proposes MultimodalHugs to address flexibility limitations in sign language processing. |
multimodal |
|
|
| 3 |
Acquiescence Bias in Large Language Models |
Reveals a "denial bias" in large language models, the opposite of the human tendency toward conformity. |
large language model |
|
|
| 4 |
DiTTO-LLM: Framework for Discovering Topic-based Technology Opportunities via Large Language Model |
DiTTO-LLM: a framework that uses large language models to discover topic-based technology opportunities. |
large language model |
|
|
| 5 |
ALIGNS: Unlocking nomological networks in psychological measurement through a large language model |
ALIGNS: uses a large language model to unlock nomological networks in psychological measurement, improving validity assessment. |
large language model |
|
|
| 6 |
A Role-Aware Multi-Agent Framework for Financial Education Question Answering with LLMs |
Proposes a role-aware multi-agent framework that improves LLM accuracy in financial education question answering. |
large language model, chain-of-thought |
|
|
| 7 |
Stated Preference for Interaction and Continued Engagement (SPICE): Evaluating an LLM's Willingness to Re-engage in Conversation |
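Proposes the SPICE metric, which uses stated-preference surveys to evaluate an LLM's interaction tendency and continued engagement across different contexts. |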
large language model |
|
|
| 8 |
Documents Are People and Words Are Items: A Psychometric Approach to Textual Data with Contextual Embeddings |
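Proposes a psychometrics-based approach to analyzing textual data that uses contextual embeddings to reveal latent knowledge dimensions in text. |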
large language model |
|
|
| 9 |
Building High-Quality Datasets for Portuguese LLMs: From Common Crawl Snapshots to Industrial-Grade Corpora |
Proposes a method for building high-quality datasets for Portuguese LLMs, with performance on par with industrial-grade corpora. |
large language model |
|
|
| 10 |
Evaluating LLMs Without Oracle Feedback: Agentic Annotation Evaluation Through Unsupervised Consistency Signals |
Proposes an agentic annotation evaluation method based on consistency signals that assesses LLM annotation quality without human feedback. |
large language model |
|
|
| 11 |
The meaning of prompts and the prompts of meaning: Semiotic reflections and modelling |
Drawing on Peirce's semiotic theory, reframes LLM prompt engineering as a dynamic process of sign interaction. |
large language model |
|
|
| 12 |
Discrimination by LLMs: Cross-lingual Bias Assessment and Mitigation in Decision-Making and Summarisation |
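Assesses and mitigates cross-lingual bias in LLM decision-making and summarisation tasks, focusing on the effects of background, gender, and age. |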
large language model |
|
|
| 13 |
Benchmarking Vision-Language Models on Chinese Ancient Documents: From OCR to Knowledge Reasoning |
Proposes the AncientDoc benchmark for evaluating vision-language models' ability to understand ancient documents. |
large language model |
|
|
| 14 |
Too Helpful, Too Harmless, Too Honest or Just Right? |
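TrinityX: proposes a modular alignment framework based on a calibrated mixture of experts that improves the HHH (helpful, harmless, honest) alignment of LLMs. |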
large language model |
|
|
| 15 |
<think> So let's replace this phrase with insult... </think> Lessons learned from generation of toxic texts with LLMs |
Shows that LLM-generated toxic text performs worse than human-written data in text detoxification tasks. |
large language model |
|
|