| 1 |
The Moral Consistency Pipeline: Continuous Ethical Evaluation for Large Language Models |
Proposes the Moral Consistency Pipeline (MoCoP) for continuous ethical evaluation of large language models |
large language model |
|
|
| 2 |
Fine-Tuned Large Language Models for Logical Translation: Reducing Hallucinations with Lang2Logic |
Proposes the Lang2Logic framework, which uses fine-tuned large language models to reduce hallucinations in logical translation |
large language model |
|
|
| 3 |
BOOM: Beyond Only One Modality KIT's Multimodal Multilingual Lecture Companion |
Proposes BOOM to address the localization of multimodal, multilingual lecture content |
multimodal |
✅ |
|
| 4 |
A benchmark dataset for evaluating Syndrome Differentiation and Treatment in large language models |
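Builds TCM-BEST4SDT, a benchmark for large language models in traditional Chinese medicine, to evaluate syndrome differentiation and treatment ability. |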
large language model |
|
|
| 5 |
Towards Unification of Hallucination Detection and Fact Verification for Large Language Models |
Proposes UniFact, a unified framework that bridges the research gap between LLM hallucination detection and fact verification |
large language model |
✅ |
|
| 6 |
PEFT-Factory: Unified Parameter-Efficient Fine-Tuning of Autoregressive Large Language Models |
PEFT-Factory: a unified framework for parameter-efficient fine-tuning of autoregressive large language models |
large language model |
✅ |
|
| 7 |
Spoken Conversational Agents with Large Language Models |
Spoken conversational agents are evolving toward speech-native LLMs; this tutorial provides a system-level roadmap. |
large language model |
|
|
| 8 |
TaleFrame: An Interactive Story Generation System with Fine-Grained Control and Large Language Models |
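TaleFrame: an interactive story generation system with fine-grained control, combining large language models with human-computer interaction |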
large language model |
✅ |
|
| 9 |
Emergent Bayesian Behaviour and Optimal Cue Combination in LLMs |
Proposes the BayesBench benchmark to evaluate Bayesian behaviour and optimal cue combination in LLMs on multimodal perception tasks |
large language model multimodal |
|
|
| 10 |
Randomized Masked Finetuning: An Efficient Way to Mitigate Memorization of PIIs in LLMs |
Proposes randomized masked fine-tuning to address privacy leakage in large language models |
large language model |
|
|
| 11 |
Is Vibe Coding Safe? Benchmarking Vulnerability of Agent-Generated Code in Real-World Tasks |
The SUSVIBES benchmark reveals that agent-generated code contains serious security vulnerabilities in real-world software engineering tasks |
large language model |
|
|
| 12 |
InvertiTune: High-Quality Data Synthesis for Cost-Effective Single-Shot Text-to-Knowledge Graph Generation |
InvertiTune: cost-effective single-shot text-to-knowledge-graph generation through high-quality data synthesis |
large language model |
|
|
| 13 |
Enhancing Job Matching: Occupation, Skill and Qualification Linking with the ESCO and EQF taxonomies |
Uses language models to enhance job matching by linking occupations, skills, and European classification systems |
large language model |
✅ |
|
| 14 |
Variance-Aware LLM Annotation for Strategy Research: Sources, Diagnostics, and a Protocol for Reliable Measurement |
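Proposes a variance-aware LLM annotation protocol that improves the reliability and reproducibility of text annotation in strategy research |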
large language model |
|
|
| 15 |
Cross-Lingual Prompt Steerability: Towards Accurate and Robust LLM Behavior across Languages |
Proposes a cross-lingual prompt steerability framework to improve LLM accuracy and robustness in multilingual settings |
large language model |
|
|
| 16 |
promptolution: A Unified, Modular Framework for Prompt Optimization |
Proposes promptolution, a unified, modular prompt optimization framework that improves large language model performance. |
large language model |
|
|
| 17 |
Noise-Driven Persona Formation in Reflexive Neural Language Generation |
Proposes the Luca-Noise reflex protocol to study noise-driven persona emergence in large language models |
large language model |
|
|
| 18 |
CREST: Universal Safety Guardrails Through Cluster-Guided Cross-Lingual Transfer |
CREST: universal safety guardrails through cluster-guided cross-lingual transfer |
large language model |
|
|
| 19 |
An Empirical Survey of Model Merging Algorithms for Social Bias Mitigation |
Model merging algorithms for mitigating social bias: an empirical study on LLMs |
large language model |
|
|
| 20 |
Input Order Shapes LLM Semantic Alignment in Multi-Document Summarization |
Input order affects LLM semantic alignment in multi-document summarization; the first document shows a significant primacy effect |
large language model |
|
|
| 21 |
LeechHijack: Covert Computational Resource Exploitation in Intelligent Agent Systems |
Proposes the LeechHijack attack, revealing the risk of covert resource hijacking by third-party tools in agent systems. |
large language model |
|
|
| 22 |
When Does Verification Pay Off? A Closer Look at LLMs as Solution Verifiers |
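Studies the effectiveness of LLMs as solution verifiers, revealing the advantages of cross-model verification and the impact of post-training. |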
large language model |
|
|