| 1 |
AMix-2: Establishing Protein as a Native Modality in Large Language Models |
AMix-2: builds a large language model with protein as a native modality, unifying protein understanding and design |
large language model foundation model |
|
|
| 2 |
Differentially Private Preference Data Synthesis for Large Language Model Alignment |
Proposes the DPPrefSyn algorithm for generating differentially private preference data to align large language models. |
large language model |
✅ |
|
| 3 |
FAM-Bench: A Multimodal Benchmark for Condition-Aware Food-as-Medicine Reasoning |
Proposes FAM-Bench, a multimodal benchmark for evaluating models' dietary recommendation ability under specific health conditions |
multimodal |
|
|
| 4 |
ImmersiveTTS: Environment-Aware Text-to-Speech with Multimodal Diffusion Transformer and Domain-Specific Representation Alignment |
ImmersiveTTS: proposes an environment-aware TTS model that achieves immersive speech generation via a multimodal diffusion Transformer. |
multimodal |
|
|
| 5 |
BilliardPhys-Bench: Benchmarking Physical Reasoning and Visual Dynamics of Multimodal LLMs |
BilliardPhys-Bench: a benchmark for evaluating the physical reasoning and visual dynamics of multimodal LLMs |
multimodal |
|
|
| 6 |
Learning to Adapt: Self-Improving Web Agent via Cognitive-Aware Exploration |
Proposes the SCALE framework, which improves the adaptability of web agents in dynamic environments through cognitive exploration |
large language model multimodal |
|
|
| 7 |
LinTree: Improving LLM Reasoning with Explicitly Structured Search Histories |
LinTree: improves LLM reasoning through explicitly structured search histories |
large language model |
|
|
| 8 |
Neither Replacement nor Panacea: Comparing LLM-Based Conversational and Graphical Decision Support in Industrial Tasks |
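Compares LLM-based conversational and graphical decision support in industrial tasks, finding that each has its own strengths and weaknesses. |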
large language model |
|
|
| 9 |
LLM-FACETS: A Privacy-Preserving Framework for Evaluating LLM Transparency and Accountability |
LLM-FACETS: a privacy-preserving framework for evaluating LLM transparency and accountability |
large language model |
|
|
| 10 |
Developing a UXR Point of View for Cognitive Accessibility in Mobile Learning with Generative AI |
Uses generative AI to develop a UXR point of view for cognitive accessibility in mobile learning |
large language model |
|
|
| 11 |
SpecDB: LLM-Generated Customized Databases via Feature-Oriented Decomposition |
SpecDB: generates customized databases using LLMs and feature-oriented decomposition |
large language model |
|
|
| 12 |
GraphARC: A Comprehensive Benchmark for Graph-Based Abstract Reasoning |
Proposes GraphARC: a comprehensive benchmark for graph-based abstract reasoning. |
foundation model |
|
|
| 13 |
TUX: Measuring Human--AI Tacit Understanding |
Proposes the TUX metric, which measures tacit understanding between humans and AI in the absence of explicit goals |
large language model |
|
|
| 14 |
A Unified and Reproducible Experimentation Framework for Speech Understanding |
SURE: a unified and reproducible experimentation framework for speech understanding that improves model selection efficiency. |
foundation model |
|
|
| 15 |
UniScale: Adaptive Unified Inference Scaling via Online Joint Optimization of Model Routing and Test-Time Scaling |
Proposes UniScale, which adaptively unifies inference scaling through online joint optimization of model routing and test-time scaling. |
large language model |
|
|
| 16 |
MAVEN: Improving Generalization in Agentic Tool Calling |
MAVEN: improves generalization in agentic tool calling through a modular verification and execution network |
large language model |
|
|