| 1 |
Survey on Vision-Language-Action Models |
AI-assisted literature survey: exploring the use of large language models in vision-language-action model research |
vision-language-action VLA large language model |
|
|
| 2 |
Transforming Science with Large Language Models: A Survey on AI-assisted Scientific Discovery, Experimentation, Content Generation, and Evaluation |
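Survey: large language models empowering scientific research, covering AI-assisted discovery, experimentation, content generation, and evaluation |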
large language model multimodal |
|
|
| 3 |
Beyond External Monitors: Enhancing Transparency of Large Language Models for Easier Monitoring |
Proposes the TELLME method to improve the transparency of large language models, making improper behavior easier to monitor |
large language model chain-of-thought |
|
|
| 4 |
M-IFEval: Multilingual Instruction-Following Evaluation |
Proposes M-IFEval, a multilingual instruction-following evaluation benchmark that extends LLM evaluation to French, Japanese, and Spanish. |
large language model instruction following |
|
|
| 5 |
Concept Navigation and Classification via Open-Source Large Language Model Processing |
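Proposes a concept navigation and classification framework based on open-source large language models for detecting and classifying latent structures in text data. |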
large language model |
|
|
| 6 |
ARR: Question Answering with Large Language Models via Analyzing, Retrieving, and Reasoning |
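Proposes the ARR framework, which improves large language model performance on question answering through analyzing, retrieving, and reasoning |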
large language model |
|
|
| 7 |
Evaluating Personality Traits in Large Language Models: Insights from Psychological Questionnaires |
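Uses psychological questionnaires to evaluate personality traits in large language models, revealing differences in their underlying personalities |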
large language model |
|
|
| 8 |
Probabilistic Subspace Manifolds for Contextual Inference in Large Language Models |
Proposes an LLM contextual inference method based on probabilistic subspace manifolds, improving semantic granularity and robustness |
large language model |
|
|
| 9 |
Probing Internal Representations of Multi-Word Verbs in Large Language Models |
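Probes the internal representations of multi-word verbs in large language models, revealing how lexical and syntactic properties are encoded. |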
large language model |
|
|
| 10 |
S$^2$-MAD: Breaking the Token Barrier to Enhance Multi-Agent Debate Efficiency |
S$^2$-MAD: breaking the token barrier to improve the efficiency of multi-agent debate |
large language model chain-of-thought |
|
|
| 11 |
SeDi-Instruct: Enhancing Alignment of Language Models through Self-Directed Instruction Generation |
Proposes SeDi-Instruct, which improves language model alignment through self-directed instruction generation. |
large language model foundation model |
|
|
| 12 |
Self-Supervised Prompt Optimization |
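Proposes the Self-Supervised Prompt Optimization (SPO) framework, which improves LLM performance across a range of tasks without external references. |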
large language model |
✅ |
|
| 13 |
Aligning Black-box Language Models with Human Judgments |
Proposes a linear mapping framework to align black-box language models with subjective human judgments |
large language model |
|
|
| 14 |
LLM-Supported Natural Language to Bash Translation |
Proposes a high-quality dataset and evaluation method to improve LLM accuracy in translating natural language to Bash commands. |
large language model |
✅ |
|
| 15 |
Flexible and Efficient Grammar-Constrained Decoding |
Proposes a faster grammar-constrained decoding algorithm that speeds up structured output generation by LLMs. |
large language model |
|
|
| 16 |
GSM-Infinite: How Do Your LLMs Behave over Infinitely Increasing Context Length and Reasoning Complexity? |
Proposes GSM-Infinite for evaluating LLM behavior under infinitely increasing context length and reasoning complexity |
large language model |
|
|
| 17 |
Enhancing Knowledge Graph Construction: Evaluating with Emphasis on Hallucination, Omission, and Graph Similarity Metrics |
Proposes a BERTScore-based evaluation framework for knowledge graph construction, focusing on hallucination and omission |
large language model |
|
|
| 18 |
Extracting and Understanding the Superficial Knowledge in Alignment |
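Proposes a method for extracting and understanding the superficial knowledge in aligned models, for efficient model alignment and safety recovery. |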
large language model |
|
|
| 19 |
Fine-Tuned LLMs are "Time Capsules" for Tracking Societal Bias Through Books |
Uses fine-tuned LLMs and books to track how societal bias evolves over time |
large language model |
|
|
| 20 |
Can LLMs Rank the Harmfulness of Smaller LLMs? We are Not There Yet |
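Evaluates the ability of large language models to annotate harmful content produced by smaller language models; the results show that current methods are not yet mature. |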
large language model |
|
|
| 21 |
NoLiMa: Long-Context Evaluation Beyond Literal Matching |
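NoLiMa: proposes a long-context evaluation benchmark that goes beyond literal matching, revealing the limitations of LLMs in long-range reasoning |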
large language model |
✅ |
|
| 22 |
CodeSCM: Causal Analysis for Multi-Modal Code Generation |
Proposes CodeSCM for analyzing causal effects in multi-modal code generation. |
large language model |
|
|
| 23 |
An Annotated Reading of 'The Singer of Tales' in the LLM Era |
Reads 'The Singer of Tales' through the lens of large language models, exploring the theory of oral poetry composition |
large language model |
|
|
| 24 |
nvAgent: Automated Data Visualization from Natural Language via Collaborative Agent Workflow |
Proposes nvAgent, which uses a collaborative agent workflow to address the difficulty of multi-table reasoning in complex NL2Vis tasks. |
large language model |
|
|
| 25 |
Developmentally-plausible Working Memory Shapes a Critical Period for Language Acquisition |
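Proposes a developmentally plausible working memory model that improves the learning efficiency of language models during the critical period |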
large language model |
|
|