| 1 |
MateInfoUB: A Real-World Benchmark for Testing LLMs in Competitive, Multilingual, and Multimodal Educational Tasks |
MateInfoUB: a real-world benchmark for evaluating LLMs on multilingual, multimodal competitive educational tasks |
large language model multimodal |
|
|
| 2 |
SI-Agent: An Agentic Framework for Feedback-Driven Generation and Tuning of Human-Readable System Instructions for Large Language Models |
SI-Agent: a feedback-driven agentic framework for generating and optimizing human-readable system instructions for LLMs |
large language model |
|
|
| 3 |
Self-DANA: A Resource-Efficient Channel-Adaptive Self-Supervised Approach for ECG Foundation Models |
Proposes Self-DANA, a resource-efficient, channel-adaptive self-supervised method for ECG foundation models |
foundation model |
|
|
| 4 |
Strategic Intelligence in Large Language Models: Evidence from evolutionary Game Theory |
Uses evolutionary game theory to assess the strategic intelligence of large language models |
large language model |
|
|
| 5 |
Autonomous Control Leveraging LLMs: An Agentic Framework for Next-Generation Industrial Automation |
Proposes an LLM-based agentic framework that unifies discrete planning and continuous control for next-generation industrial automation |
large language model instruction following |
|
|
| 6 |
Symbiosis: Multi-Adapter Inference and Fine-Tuning |
Symbiosis: a multi-adapter inference and fine-tuning framework that addresses resource management and privacy issues |
large language model |
|
|
| 7 |
Bourbaki: Self-Generated and Goal-Conditioned MDPs for Theorem Proving |
Proposes Bourbaki, a framework based on self-generated, goal-conditioned MDPs, to improve theorem-proving ability |
large language model |
|
|
| 8 |
Content filtering methods for music recommendation: A review |
Review: content filtering methods for music recommendation, addressing interaction sparsity |
large language model |
|
|
| 9 |
The Impact of LLM-Assistants on Software Developer Productivity: A Systematic Literature Review |
A systematic literature review of how LLM assistants affect software developer productivity, finding both benefits and risks |
large language model |
|
|
| 10 |
LLM-Driven Auto Configuration for Transient IoT Device Collaboration |
CollabIoT: LLM-driven automatic configuration for secure collaboration among transient IoT devices |
large language model |
|
|
| 11 |
Moral Responsibility or Obedience: What Do We Want from AI? |
Re-examining AI safety: a paradigm shift from obedience to moral reasoning |
large language model |
|
|
| 12 |
Knowledge Protocol Engineering: A New Paradigm for AI in Domain-Specific Knowledge Work |
Proposes Knowledge Protocol Engineering (KPE), enabling LLMs to reason deeply in domain-specific knowledge work |
large language model |
|
|
| 13 |
Meta SecAlign: A Secure Foundation LLM Against Prompt Injection Attacks |
Meta SecAlign: the first fully open-source secure LLM that defends against prompt injection attacks |
instruction following |
✅ |
|
| 14 |
LLMs and Fuzzing in Tandem: A New Approach to Automatically Generating Weakest Preconditions |
Proposes Fuzzing Guidance, a method that combines LLMs with fuzzing to automatically generate weakest preconditions |
large language model |
|
|
| 15 |
Hey AI, Generate Me a Hardware Code! Agentic AI-based Hardware Design & Verification |
Proposes an agentic-AI-based approach to hardware design and verification that improves efficiency and coverage |
large language model |
|
|
| 16 |
FlowSpec: Continuous Pipelined Speculative Decoding for Efficient Distributed LLM Inference |
FlowSpec: a continuous pipelined speculative decoding framework for efficient distributed LLM inference |
large language model |
✅ |
|
| 17 |
Improving LLM Reasoning for Vulnerability Detection via Group Relative Policy Optimization |
Proposes a GRPO-based LLM fine-tuning method that improves reasoning for software vulnerability detection |
large language model |
|
|
| 18 |
Clarifying Before Reasoning: A Coq Prover with Structural Context |
Proposes a Coq theorem prover based on structural context that significantly improves LLM reasoning |
large language model |
|
|