| 1 |
Role-Playing Agents Driven by Large Language Models: Current Status, Challenges, and Future Trends |
Survey paper: current status, challenges, and future trends of role-playing agents driven by large language models |
large language model multimodal |
|
|
| 2 |
SIN-Bench: Tracing Native Evidence Chains in Long-Context Multimodal Scientific Interleaved Literature |
Proposes the FITO paradigm to address the understanding of multimodal scientific literature |
large language model multimodal |
|
|
| 3 |
GeoSteer: Faithful Chain-of-Thought Steering via Latent Manifold Gradients |
GeoSteer: improving faithful chain-of-thought reasoning in LLMs via latent manifold gradients |
large language model chain-of-thought |
|
|
| 4 |
Contextual StereoSet: Stress-Testing Bias Alignment Robustness in Large Language Models |
Proposes Contextual StereoSet for stress-testing the robustness of bias alignment in large language models across different contexts |
large language model |
|
|
| 5 |
OctoBench: Benchmarking Scaffold-Aware Instruction Following in Repository-Grounded Agentic Coding |
OctoBench: evaluating how well coding agents follow scaffold instructions in code-repository environments |
instruction following |
|
|
| 6 |
Detecting Winning Arguments with Large Language Models and Persuasion Strategies |
Uses large language models and persuasion strategies to detect the winning side in argumentative texts |
large language model |
|
|
| 7 |
Credit C-GPT: A Domain-Specialized Large Language Model for Conversational Understanding in Vietnamese Debt Collection |
Proposes Credit C-GPT: a domain-specialized large language model for Vietnamese debt-collection scenarios |
large language model |
|
|
| 8 |
MoST: Mixing Speech and Text with Modality-Aware Mixture of Experts |
MoST: fusing speech and text with a modality-aware mixture-of-experts model |
large language model multimodal |
✅ |
|
| 9 |
Grounding Agent Memory in Contextual Intent |
Proposes STITCH, which indexes memory by contextual intent to resolve ambiguity in memory retrieval during long-horizon interactions |
large language model |
|
|
| 10 |
Loop as a Bridge: Can Looped Transformers Truly Link Representation Space and Natural Language Outputs? |
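Investigates whether looped Transformers can, through iteration, strengthen the link between representation space and natural language outputs |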
large language model |
|
|
| 11 |
AWED-FiNER: Agents, Web applications, and Expert Detectors for Fine-grained Named Entity Recognition across 36 Languages for 6.6 Billion Speakers |
AWED-FiNER: fine-grained named entity recognition across 36 languages for 6.6 billion speakers |
large language model |
✅ |
|
| 12 |
DR-Arena: an Automated Evaluation Framework for Deep Research Agents |
DR-Arena: proposes a fully automated evaluation framework for deep research agents that addresses the limitations of existing benchmarks |
large language model |
|
|
| 13 |
The Assistant Axis: Situating and Stabilizing the Default Persona of Language Models |
Proposes the Assistant Axis concept to stabilize the default persona of large language models and suppress harmful behavior |
large language model |
|
|
| 14 |
Unlocking Implicit Experience: Synthesizing Tool-Use Trajectories from Text |
Proposes GEM, a method for synthesizing tool-use trajectories from text that improves LLMs' multi-turn interaction ability |
large language model |
|
|
| 15 |
The Straight and Narrow: Do LLMs Possess an Internal Moral Path? |
Uses Moral Foundations Theory to improve LLMs' moral alignment by intervening on their internal moral representations |
large language model |
|
|
| 16 |
HUMANLLM: Benchmarking and Reinforcing LLM Anthropomorphism via Human Cognitive Patterns |
HUMANLLM: benchmarking and reinforcing LLM anthropomorphism via human cognitive patterns |
large language model |
|
|
| 17 |
CALM-IT: Generating Realistic Long-Form Motivational Interviewing Dialogues with Dual-Actor Conversational Dynamics Tracking |
CALM-IT: generating realistic long-form motivational interviewing dialogues via dual-actor conversational dynamics tracking |
large language model |
|
|