| 1 |
Aware First, Think Less: Dynamic Boundary Self-Awareness Drives Extreme Reasoning Efficiency in Large Language Models |
Proposes a dynamic boundary self-awareness framework to improve the reasoning efficiency of large language models |
large language model chain-of-thought |
|
|
| 2 |
Speciesism in AI: Evaluating Discrimination Against Animals in Large Language Models |
Evaluates speciesism in large language models |
large language model |
|
|
| 3 |
AI in Mental Health: Emotional and Sentiment Analysis of Large Language Models' Responses to Depression, Anxiety, and Stress Queries |
Studies emotion and sentiment analysis of large language models in the mental health domain |
large language model |
|
|
| 4 |
LETToT: Label-Free Evaluation of Large Language Models On Tourism Using Expert Tree-of-Thought |
Proposes the LETToT framework to address LLM evaluation in the tourism domain |
large language model |
|
|
| 5 |
Hallucination Detection and Mitigation in Scientific Text Simplification using Ensemble Approaches: DS@GT at CLEF 2025 SimpleText |
Proposes ensemble methods to detect and mitigate hallucinations in scientific text simplification |
large language model |
|
|
| 6 |
LLM-Guided Planning and Summary-Based Scientific Text Simplification: DS@GT at CLEF 2025 SimpleText |
Proposes an LLM-based scientific text simplification method to address text complexity |
large language model |
|
|
| 7 |
A Multi-Task Evaluation of LLMs' Processing of Academic Text Input |
Evaluates the multi-task capabilities of large language models in processing academic text |
large language model |
|
|
| 8 |
Can we Evaluate RAGs with Synthetic Data? |
Examines the effectiveness of synthetic data for RAG evaluation |
large language model |
|
|
| 9 |
Online Anti-sexist Speech: Identifying Resistance to Gender Bias in Political Discourse |
Proposes a method for identifying online anti-sexist speech to address gender bias in political discourse |
large language model |
|
|
| 10 |
Survey-to-Behavior: Downstream Alignment of Human Values in LLMs via Survey Questions |
Adjusts the human values of large language models through survey questions |
large language model |
|
|
| 11 |
When Punctuation Matters: A Large-Scale Comparison of Prompt Robustness Methods for LLMs |
Proposes five methods to improve the prompt robustness of large language models |
large language model |
✅ |
|
| 12 |
Feedback Indicators: The Alignment between Llama and a Teacher in Language Learning |
Proposes a Llama-based method for extracting feedback indicators to improve language-learning feedback |
large language model |
|
|
| 13 |
SpecDetect: Simple, Fast, and Training-Free Detection of LLM-Generated Text via Spectral Analysis |
Proposes SpecDetect to address the detection of LLM-generated text |
large language model |
|
|
| 14 |
SGSimEval: A Comprehensive Multifaceted and Similarity-Enhanced Benchmark for Automatic Survey Generation Systems |
Proposes SGSimEval to address the inadequate evaluation of automatic survey generation systems |
large language model |
|
|
| 15 |
ToxiFrench: Benchmarking and Enhancing Language Models via CoT Fine-Tuning for French Toxicity Detection |
Proposes ToxiFrench to address French toxicity detection |
chain-of-thought |
|
|
| 16 |
UNVEILING: What Makes Linguistics Olympiad Puzzles Tricky for LLMs? |
Reveals the challenges that Linguistics Olympiad puzzles pose for large language models |
large language model |
|
|
| 17 |
E-CaTCH: Event-Centric Cross-Modal Attention with Temporal Consistency and Class-Imbalance Handling for Misinformation Detection |
Proposes E-CaTCH to address multimodal misinformation detection on social media |
multimodal |
|
|