| 1 |
Attentive Reasoning Queries: A Systematic Method for Optimizing Instruction-Following in Large Language Models |
Proposes ARQs, which guide LLMs with domain-expert knowledge and significantly improve complex instruction following |
large language model instruction following chain-of-thought |
|
|
| 2 |
MA-LoT: Model-Collaboration Lean-based Long Chain-of-Thought Reasoning enhances Formal Theorem Proving |
MA-LoT: model-collaborative long chain-of-thought reasoning improves formal theorem proving in Lean |
large language model chain-of-thought |
|
|
| 3 |
The Devil Is in the Details: Tackling Unimodal Spurious Correlations for Generalizable Multimodal Reward Models |
Proposes a Shortcut-aware algorithm to address unimodal spurious correlations in multimodal reward models |
large language model multimodal |
|
|
| 4 |
Psy-Copilot: Visual Chain of Thought for Counseling |
Proposes Psy-Copilot, which uses a visualized CoT to assist psychological counseling and improve LLM interpretability |
large language model chain-of-thought |
|
|
| 5 |
Analogical Reasoning Inside Large Language Models: Concept Vectors and the Limits of Abstraction |
Investigates analogical reasoning in large language models: concept vectors and the limits of abstraction |
large language model |
|
|
| 6 |
Open-Source Large Language Models as Multilingual Crowdworkers: Synthesizing Open-Domain Dialogues in Several Languages With No Examples in Targets and No Machine Translation |
Uses open-source large language models as multilingual crowdworkers to synthesize multilingual open-domain dialogues zero-shot |
large language model |
|
|
| 7 |
Enhancing Collective Intelligence in Large Language Models Through Emotional Integration |
Enhances the collective intelligence of large language models through emotional integration |
large language model |
|
|
| 8 |
Taxation Perspectives from Large Language Models: A Case Study on Additional Tax Penalties |
Proposes the PLAT benchmark to evaluate how well large language models predict the legitimacy of additional tax penalties |
large language model |
|
|
| 9 |
"Only ChatGPT gets me": An Empirical Analysis of GPT versus other Large Language Models for Emotion Detection in Text |
Evaluates large language models on emotion detection in text, focusing on a comparison of ChatGPT with other LLMs |
large language model |
|
|
| 10 |
Performance Comparison of Large Language Models on Advanced Calculus Problems |
Compares seven large language models on advanced calculus problems, revealing their strengths and weaknesses |
large language model |
|
|
| 11 |
Developing and Utilizing a Large-Scale Cantonese Dataset for Multi-Tasking in Large Language Models |
Builds a large-scale Cantonese dataset to improve large language model performance on Cantonese multi-task processing |
large language model |
|
|
| 12 |
Token-Level Privacy in Large Language Models |
Proposes dchi-stencil to address privacy issues in large language models |
large language model |
|
|
| 13 |
Visualising Policy-Reward Interplay to Inform Zeroth-Order Preference Optimisation of Large Language Models |
Proposes ZOPrO, a zeroth-order optimization algorithm for preference optimization of large language models |
large language model |
✅ |
|
| 14 |
iNews: A Multimodal Dataset for Modeling Personalized Affective Responses to News |
iNews: a large-scale multimodal news dataset for modeling personalized affective responses |
multimodal |
|
|
| 15 |
DSVD: Dynamic Self-Verify Decoding for Faithful Generation in Large Language Models |
Proposes DSVD, a dynamic self-verify decoding framework for improving the reliability of large language model generations |
large language model |
|
|
| 16 |
PowerAttention: Exponentially Scaling of Receptive Fields for Effective Sparse Attention |
Proposes PowerAttention to address sparse attention in long-context processing |
large language model |
|
|
| 17 |
Monitoring Decoding: Mitigating Hallucination via Evaluating the Factuality of Partial Response during Generation |
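Proposes the Monitoring Decoding framework, which mitigates hallucination in large language models by dynamically monitoring and intervening in the generation process |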
large language model |
|
|
| 18 |
Process-based Self-Rewarding Language Models |
Proposes process-based self-rewarding language models to improve mathematical reasoning |
large language model |
|
|
| 19 |
Replicating Human Social Perception in Generative AI: Evaluating the Valence-Dominance Model |
Evaluates the Valence-Dominance model to study how well generative AI replicates human social perception |
multimodal |
|
|
| 20 |
Framing the Game: How Context Shapes LLM Decision-Making |
Proposes a context-framing evaluation method, revealing how sensitive LLM decision-making is to context |
large language model |
|
|
| 21 |
Extrapolation Merging: Keep Improving With Extrapolation and Merging |
Proposes Extrapolation Merging, which keeps improving LLM performance without additional compute or data |
large language model |
|
|
| 22 |
Geometry-Guided Adversarial Prompt Detection via Curvature and Local Intrinsic Dimension |
Proposes CurvaLID, which uses geometric properties to efficiently detect adversarial prompts in large language models |
large language model |
|
|
| 23 |
RASD: Retrieval-Augmented Speculative Decoding |
Proposes RASD, retrieval-augmented speculative decoding that accelerates LLM inference and improves out-of-domain generalization |
large language model |
|
|
| 24 |
Cite Before You Speak: Enhancing Context-Response Grounding in E-commerce Conversational LLM-Agents |
Proposes a citation-generation method for e-commerce conversational LLM agents, improving factual grounding and user trust |
large language model |
|
|
| 25 |
Towards Robust Universal Information Extraction: Benchmark, Evaluation, and Solution |
Proposes RUIE-Bench to address robustness in universal information extraction |
large language model |
|
|
| 26 |
Psy-Insight: Explainable Multi-turn Bilingual Dataset for Mental Health Counseling |
Builds Psy-Insight, an explainable multi-turn bilingual dataset for mental health counseling |
large language model |
|
|
| 27 |
Feature-Level Insights into Artificial Text Detection with Sparse Autoencoders |
Uses sparse autoencoders to extract features, improving the interpretability of AI-generated text detection |
large language model |
|
|
| 28 |
EnigmaToM: Improve LLMs' Theory-of-Mind Reasoning Capabilities with Neural Knowledge Base of Entity States |
Proposes the EnigmaToM framework, which uses a neural knowledge base to improve LLM theory-of-mind reasoning |
large language model |
|
|
| 29 |
SEOE: A Scalable and Reliable Semantic Evaluation Framework for Open Domain Event Detection |
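Proposes the SEOE framework, which addresses the difficulty of evaluating open-domain event detection through semantic evaluation and a scalable benchmark |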
large language model |
|
|