| 1 |
MMMT-IF: A Challenging Multimodal Multi-Turn Instruction Following Benchmark |
Proposes the MMMT-IF benchmark for evaluating instruction-following ability in multimodal, multi-turn dialogue. |
multimodal instruction following |
|
|
| 2 |
Compositional Hardness of Code in Large Language Models -- A Probabilistic Perspective |
Reveals the inherent difficulty of compositional tasks in LLM code generation and proposes a multi-agent decomposition strategy |
large language model chain-of-thought |
|
|
| 3 |
Learning to Love Edge Cases in Formative Math Assessment: Using the AMMORE Dataset and Chain-of-Thought Prompting to Improve Grading Accuracy |
Proposes the AMMORE dataset and uses CoT prompting to improve LLM grading accuracy on edge cases in formative math assessment |
large language model chain-of-thought |
|
|
| 4 |
Harmful Fine-tuning Attacks and Defenses for Large Language Models: A Survey |
Surveys harmful fine-tuning attacks and defenses to address safety risks in large language models |
large language model |
✅ |
|
| 5 |
A Multimodal Single-Branch Embedding Network for Recommendation in Cold-Start and Missing Modality Scenarios |
Proposes SiBraR, a single-branch embedding network, to address cold-start and missing-modality problems in recommender systems. |
multimodal |
|
|
| 6 |
MaskLLM: Learnable Semi-Structured Sparsity for Large Language Models |
MaskLLM: a learnable semi-structured sparsity method for large language models |
large language model |
✅ |
|
| 7 |
Development and Validation of a Large Language Model for Generating Fully-Structured Radiology Reports |
Proposes an LLM with dynamic template-constrained decoding for generating high-quality, structured lung cancer screening reports. |
large language model |
|
|
| 8 |
Trustworthy AI: Securing Sensitive Data in Large Language Models |
Proposes a trust framework for large language models to secure sensitive data |
large language model |
|
|
| 9 |
A Scalable Data-Driven Framework for Systematic Analysis of SEC 10-K Filings Using Large Language Models |
Proposes a scalable data-driven framework that uses large language models to systematically analyze SEC 10-K filings. |
large language model |
|
|
| 10 |
Infer Human's Intentions Before Following Natural Language Instructions |
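Proposes the FISER framework, which predicts human intentions through social reasoning to improve instruction-following performance in embodied collaborative tasks. |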
instruction following chain-of-thought |
|
|
| 11 |
A Survey of Spatio-Temporal EEG data Analysis: from Models to Applications |
Surveys spatio-temporal EEG data analysis methods and their applications |
large language model foundation model |
✅ |
|
| 12 |
Policy Maps: Tools for Guiding the Unbounded Space of LLM Behaviors |
Proposes Policy Maps to guide the space of LLM behaviors and support AI policy design. |
large language model |
|
|
| 13 |
Data-Prep-Kit: getting your data ready for LLM application development |
Proposes Data Prep Kit (DPK) for data preparation in large language model application development. |
large language model |
|
|
| 14 |
MoJE: Mixture of Jailbreak Experts, Naive Tabular Classifiers as Guard for Prompt Attacks |
Proposes MoJE, a defense against LLM jailbreak attacks based on a mixture of experts and naive tabular classifiers |
large language model |
|
|
| 15 |
Heuristics and Biases in AI Decision-Making: Implications for Responsible AGI |
Evaluates cognitive biases in LLMs, revealing decision-making flaws in GPT-4o, Gemma 2, and Llama 3.1 |
large language model |
|
|
| 16 |
The Nexus of AR/VR, AI, UI/UX, and Robotics Technologies in Enhancing Learning and Social Interaction for Children with Autism Spectrum Disorders: A Systematic Review |
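Systematic review: the convergence of AR/VR, AI, UI/UX, and robotics technologies to improve learning and social interaction for children with autism |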
large language model |
|
|
| 17 |
AI Delegates with a Dual Focus: Ensuring Privacy and Strategic Self-Disclosure |
Proposes dual-focus AI delegates that ensure privacy and strategic self-disclosure in social interactions. |
large language model |
|
|
| 18 |
Dr. GPT in Campus Counseling: Understanding Higher Education Students' Opinions on LLM-assisted Mental Health Services |
Explores the use of LLMs in campus counseling: understanding university students' views on AI-assisted mental health services |
large language model |
|
|
| 19 |
Multi-Designated Detector Watermarking for Language Models |
Proposes multi-designated detector watermarking (MDDW) for protecting the intellectual property of large language models. |
large language model |
|
|
| 20 |
From News to Forecast: Integrating Event Analysis in LLM-Based Time Series Forecasting with Reflection |
Proposes a time series forecasting method that integrates event analysis with LLMs to improve forecasting accuracy. |
large language model |
|
|
| 21 |
Human Mobility Modeling with Household Coordination Activities under Limited Information via Retrieval-Augmented LLMs |
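Proposes a retrieval-augmented LLM framework that uses limited information to model human mobility patterns involving household coordination |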
large language model |
|
|