| 1 |
MUCAR: Benchmarking Multilingual Cross-Modal Ambiguity Resolution for Multimodal Large Language Models |
Proposes MUCAR, a benchmark for resolving ambiguity in multimodal large language models |
large language model multimodal |
|
|
| 2 |
Chain-of-Thought Prompting Obscures Hallucination Cues in Large Language Models: An Empirical Evaluation |
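Evaluates empirically how chain-of-thought prompting obscures the cues used for hallucination detection in large language models |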
large language model chain-of-thought |
✅ |
|
| 3 |
When Does Multimodality Lead to Better Time Series Forecasting? |
Systematically studies whether and under what conditions multimodality improves time series forecasting |
large language model foundation model multimodal |
|
|
| 4 |
From Thinking to Output: Chain-of-Thought and Text Generation Characteristics in Reasoning Language Models |
Proposes a new framework for analyzing the thinking and output characteristics of reasoning language models |
large language model chain-of-thought |
✅ |
|
| 5 |
Computational Approaches to Understanding Large Language Model Impact on Writing and Information Ecosystems |
Examines the impact of large language models on writing and information ecosystems |
large language model |
|
|
| 6 |
Large Language Models as symbolic DNA of cultural dynamics |
Proposes viewing large language models as the symbolic DNA of cultural dynamics |
large language model |
|
|
| 7 |
MIST: Jailbreaking Black-box Large Language Models via Iterative Semantic Tuning |
Proposes MIST, an iterative semantic tuning method for jailbreaking black-box large language models |
large language model |
|
|
| 8 |
Towards Safety Evaluations of Theory of Mind in Large Language Models |
Proposes theory-of-mind evaluation methods to improve the safety of large language models |
large language model |
|
|
| 9 |
LegiGPT: Party Politics and Transport Policy with Large Language Model |
Proposes the LegiGPT framework to analyze the influence of party politics on transport policy |
large language model |
|
|
| 10 |
Cross-Modal Obfuscation for Jailbreak Attacks on Large Vision-Language Models |
Proposes a cross-modal obfuscation method for jailbreak attacks on large vision-language models |
multimodal |
|
|
| 11 |
TeXpert: A Multi-Level Benchmark for Evaluating LaTeX Code Generation by LLMs |
Proposes the TeXpert benchmark to evaluate LLM performance on LaTeX code generation |
large language model |
✅ |
|
| 12 |
Language-Informed Synthesis of Rational Agent Models for Grounded Theory-of-Mind Reasoning On-The-Fly |
Proposes a language-informed framework for synthesizing rational agent models to address social reasoning |
multimodal |
|
|
| 13 |
VeriLocc: End-to-End Cross-Architecture Register Allocation via LLM |
Proposes VeriLocc to address register allocation across GPU architectures |
large language model |
|
|
| 14 |
Beyond the Link: Assessing LLMs' ability to Classify Political Content across Global Media |
Evaluates the ability of large language models to classify political content across global media |
large language model |
|
|
| 15 |
Leveraging LLMs to Assess Tutor Moves in Real-Life Dialogues: A Feasibility Study |
Uses large language models to assess tutor moves in real-life dialogues |
large language model |
|
|
| 16 |
Fine-Tuning Lowers Safety and Disrupts Evaluation Consistency |
Examines how fine-tuning affects the safety of large language models and the consistency of their evaluation |
large language model |
|
|
| 17 |
LLM-Generated Feedback Supports Learning If Learners Choose to Use It |
Studies the effect of LLM-generated feedback on learning and its potential for application |
large language model |
|
|
| 18 |
PersonalAI: A Systematic Comparison of Knowledge Graph Storage and Retrieval Approaches for Personalized LLM agents |
Proposes a knowledge-graph-based external memory framework to address storage and retrieval for personalized LLM agents |
large language model |
|
|
| 19 |
Mechanisms vs. Outcomes: Probing for Syntax Fails to Explain Performance on Targeted Syntactic Evaluations |
Proposes a mechanisms-versus-outcomes framework to examine the syntactic performance of language models |
large language model |
|
|