| 1 |
A Multimodal Text- and Graph-Based Approach for Open-Domain Event Extraction from Documents |
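Proposes MODEE, which combines graph learning with LLM text representations to tackle document-level reasoning in open-domain event extraction. |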
large language model multimodal |
|
|
| 2 |
Evaluation of Automatic Speech Recognition Using Generative Large Language Models |
Uses generative large language models to evaluate automatic speech recognition, improving semantic relevance. |
large language model |
|
|
| 3 |
Revisiting Non-Verbatim Memorization in Large Language Models: The Role of Entity Surface Forms |
Proposes the RedirectQA dataset to study how entity surface forms affect non-verbatim memorization in LLMs. |
large language model |
|
|
| 4 |
Process Supervision via Verbal Critique Improves Reasoning in Large Language Models |
Proposes Verbal Process Supervision, which improves the reasoning of large language models through external verbal feedback. |
large language model |
|
|
| 5 |
Unlocking the Power of Large Language Models for Multi-table Entity Matching |
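Proposes the LLM4MEM framework, which uses large language models to address semantic inconsistency and efficiency problems in multi-table entity matching. |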
large language model |
✅ |
|
| 6 |
EVENT5Ws: A Large Dataset for Open-Domain Event Extraction from Documents |
Proposes EVENT5Ws, a very large dataset for open-domain event extraction from documents. |
large language model |
|
|
| 7 |
From If-Statements to ML Pipelines: Revisiting Bias in Code-Generation |
Reveals underestimated bias in code generation, from conditional statements to machine learning pipelines. |
large language model |
|
|
| 8 |
Language as a Latent Variable for Reasoning Optimization |
Proposes polyGRPO, which treats multiple languages as a latent variable to optimize LLM reasoning and improve cross-task generalization. |
chain-of-thought |
|
|
| 9 |
Measuring Opinion Bias and Sycophancy via LLM-based Coercion |
Proposes llm-bias-bench, which uses multi-turn interaction to probe LLMs for latent bias and sycophancy on controversial topics. |
large language model |
|
|
| 10 |
Job Skill Extraction via LLM-Centric Multi-Module Framework |
Proposes the SRICL framework to address boundary drift and hallucination in LLM-based job skill extraction. |
large language model |
|
|
| 11 |
OptiVerse: A Comprehensive Benchmark towards Optimization Problem Solving |
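OptiVerse: builds a comprehensive benchmark for optimization problem solving, revealing and mitigating LLMs' modeling bottleneck on complex optimization tasks. |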
large language model |
|
|
| 12 |
Reasoning Primitives in Hybrid and Non-Hybrid LLMs |
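Studies reasoning primitives in hybrid and non-hybrid LLMs, revealing how architecture and reasoning augmentation affect performance. |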
large language model |
|
|
| 13 |
CARE: Counselor-Aligned Response Engine for Online Mental-Health Support |
CARE: a counselor-aligned response engine for online mental-health support. |
large language model |
|
|
| 14 |
When Bigger Isn't Better: A Comprehensive Fairness Evaluation of Political Bias in Multi-News Summarisation |
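Proposes a comprehensive evaluation framework for political bias in multi-document news summarization and explores debiasing interventions. |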
large language model |
|
|
| 15 |
EngramaBench: Evaluating Long-Term Conversational Memory with Structured Graph Retrieval |
EngramaBench: evaluates long-term conversational memory through structured graph retrieval. |
large language model |
|
|
| 16 |
Zero-Shot Detection of LLM-Generated Text via Implicit Reward Model |
Proposes IRM, a zero-shot method for detecting LLM-generated text based on an implicit reward model. |
large language model |
|
|
| 17 |
On Reasoning Behind Next Occupation Recommendation |
Proposes a reasoning-based occupation recommendation method that improves LLM performance on predicting future occupations. |
large language model |
✅ |
|
| 18 |
Prefix Parsing is Just Parsing |
Proposes a prefix grammar transformation that efficiently reduces prefix parsing to ordinary parsing. |
large language model |
|
|
| 19 |
Lightweight Retrieval-Augmented Generation and Large Language Model-Based Modeling for Scalable Patient-Trial Matching |
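Proposes a framework combining lightweight retrieval-augmented generation with large language model-based modeling for scalable patient-trial matching. |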
large language model multimodal |
|
|
| 20 |
Shared Lexical Task Representations Explain Behavioral Variability In LLMs |
Lexical task representations explain behavioral variability in LLMs. |
large language model |
|
|
| 21 |
Source-Modality Monitoring in Vision-Language Models |
Studies source-modality monitoring in vision-language models, revealing the role of syntactic and semantic signals. |
multimodal |
|
|
| 22 |
When Cow Urine Cures Constipation on YouTube: Limits of LLMs in Detecting Culture-specific Health Misinformation |
Reveals the limits of LLMs in detecting culture-specific health misinformation, using Indian cow urine remedies on YouTube as a case study. |
large language model |
|
|