| 1 |
Chain-of-thought Reviewing and Correction for Time Series Question Answering |
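Proposes the T3LLM framework, which improves reasoning in time series question answering through an explicit error-correction mechanism. |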
large language model chain-of-thought |
|
|
| 2 |
Structured Prompting and LLM Ensembling for Multimodal Conversational Aspect-based Sentiment Analysis |
Proposes structured prompting and LLM ensembling methods for fine-grained sentiment analysis in multimodal conversational settings. |
large language model multimodal |
|
|
| 3 |
Fragile Knowledge, Robust Instruction-Following: The Width Pruning Dichotomy in Llama-3.2 |
Width pruning of Llama-3.2 reveals that parametric knowledge degrades while instruction-following ability improves. |
instruction following |
|
|
| 4 |
Conformal Prediction Sets for Next-Token Prediction in Large Language Models: Balancing Coverage Guarantees with Set Efficiency |
Proposes the VACP framework, which balances coverage guarantees against set efficiency in LLM next-token prediction. |
large language model |
|
|
| 5 |
Exploring the Vertical-Domain Reasoning Capabilities of Large Language Models |
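Explores the reasoning capabilities of large language models in a vertical domain (accounting), providing a benchmark for enterprise digital transformation. |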
large language model |
|
|
| 6 |
Hallucination Detection and Evaluation of Large Language Model |
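Proposes the HHEM framework for efficient detection of LLM hallucinations, combined with segment-based retrieval to improve performance on summarization tasks. |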
large language model |
|
|
| 7 |
Syntactic Framing Fragility: An Audit of Robustness in LLM Ethical Decisions |
Proposes the Syntactic Framing Fragility (SFF) evaluation framework, revealing LLM sensitivity to syntactic variation in ethical decisions. |
large language model chain-of-thought |
|
|
| 8 |
Beg to Differ: Understanding Reasoning-Answer Misalignment Across Languages |
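Reveals misalignment between reasoning and answers in multilingual LLMs and proposes a cross-lingual reasoning evaluation framework. |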
large language model chain-of-thought |
|
|
| 9 |
Topic Segmentation Using Generative Language Models |
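Proposes a topic segmentation method based on generative language models, using an overlapping recursive prompting strategy to improve segmentation quality. |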
large language model |
|
|
| 10 |
SagaScale: A Realistic, Scalable, and High-Quality Long-Context Benchmark Built from Full-Length Novels |
SagaScale: a realistic, scalable, high-quality long-context benchmark built from full-length novels. |
large language model |
|
|
| 11 |
Learning When Not to Attend Globally |
Proposes All-or-Here Attention, which lets an LLM decide dynamically when to attend to global context, improving efficiency. |
large language model |
|
|
| 12 |
Mitigating Social Desirability Bias in Random Silicon Sampling |
Reduces social desirability bias in LLMs through psychologically grounded prompting. |
large language model |
|
|