| 1 |
Alignment Revisited: Are Large Language Models Consistent in Stated and Revealed Preferences? |
Revealing preference deviations in large language models: consistency evaluation and trustworthiness analysis |
large language model |
|
|
| 2 |
FinBERT2: A Specialized Bidirectional Encoder for Bridging the Gap in Finance-Specific Deployment of Large Language Models |
FinBERT2: a specialized bidirectional encoder for finance-domain LLM deployment that improves discrimination and retrieval performance |
large language model |
|
|
| 3 |
PMF-CEC: Phoneme-augmented Multimodal Fusion for Context-aware ASR Error Correction with Error-specific Selective Decoding |
Proposes PMF-CEC, which uses phoneme-augmented multimodal fusion to improve accuracy on homophones in context-aware ASR error correction. |
multimodal |
|
|
| 4 |
CMT-LLM: Contextual Multi-Talker ASR Utilizing Large Language Models |
CMT-LLM: multi-talker speech recognition with contextual biasing, using large language models to improve performance |
large language model |
|
|
| 5 |
ChartGen: Scaling Chart Understanding Via Code-Guided Synthetic Chart Generation |
ChartGen: scaling chart understanding through code-guided synthetic chart generation |
large language model multimodal |
✅ |
|
| 6 |
Machine vs Machine: Using AI to Tackle Generative AI Threats in Assessment |
Proposes a machine-versus-machine AI assessment framework to counter the threats generative AI poses to educational assessment |
large language model multimodal |
|
|
| 7 |
Position: Olfaction Standardization is Essential for the Advancement of Embodied Artificial Intelligence |
Calls on the AI community to prioritize olfaction standardization to advance embodied artificial intelligence |
multimodal |
|
|
| 8 |
CodeSense: a Real-World Benchmark and Dataset for Code Semantic Reasoning |
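CodeSense: proposes a real-world benchmark and dataset for code semantic reasoning, used to evaluate and improve code LLMs on practical software engineering tasks. |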
chain-of-thought |
✅ |
|
| 9 |
RFCAudit: An LLM Agent for Functional Bug Detection in Network Protocols |
RFCAudit: using an LLM agent to detect functional bugs in network protocols |
large language model |
|
|
| 10 |
Organizational Adaptation to Generative AI in Cybersecurity: A Systematic Review |
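Cybersecurity organizations adapt to generative AI by adjusting frameworks and processes, improving threat modeling and risk response capabilities. |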
large language model |
|
|
| 11 |
AgentAuditor: Human-Level Safety and Security Evaluation for LLM Agents |
AgentAuditor: proposes a memory-augmented reasoning framework for evaluating LLM agent safety and security that reaches human-expert level. |
chain-of-thought |
✅ |
|
| 12 |
MIRROR: Modular Internal Processing for Personalized Safety in LLM Dialogue |
MIRROR: modular internal processing to improve personalized safety in LLM dialogue |
large language model |
|
|
| 13 |
Wide Reflective Equilibrium in LLM Alignment: Bridging Moral Epistemology and AI Safety |
Uses wide reflective equilibrium to improve LLM alignment, strengthening its ethical grounding and dynamic revisability |
large language model |
|
|