| 1 |
EvalMORAAL: Interpretable Chain-of-Thought and LLM-as-Judge Evaluation for Moral Alignment in Large Language Models |
Proposes the EvalMORAAL framework for evaluating the moral alignment of large language models |
large language model chain-of-thought |
|
|
| 2 |
MMA-ASIA: A Multilingual and Multimodal Alignment Framework for Culturally-Grounded Evaluation |
Proposes the MMA-ASIA framework for evaluating large language models in multilingual, multimodal cultural contexts |
large language model multimodal |
|
|
| 3 |
Prototype-Based Dynamic Steering for Large Language Models |
Proposes Prototype-Based Dynamic Steering (PDS), which strengthens LLM reasoning without requiring instructions |
large language model chain-of-thought |
|
|
| 4 |
DACP: Domain-Adaptive Continual Pre-Training of Large Language Models for Phone Conversation Summarization |
Proposes DACP: domain-adaptive continual pre-training of LLMs to improve phone conversation summarization |
large language model |
|
|
| 5 |
LatentBreak: Jailbreaking Large Language Models through Latent Space Feedback |
LatentBreak: bypasses the safety mechanisms of large language models through latent-space feedback |
large language model |
|
|
| 6 |
H1B-KV: Hybrid One-Bit Caches for Memory-Efficient Large Language Model Inference |
H1B-KV: proposes hybrid one-bit caches to address the memory bottleneck in long-context LLM inference |
large language model |
|
|
| 7 |
EverydayMMQA: A Multilingual and Multimodal Framework for Culturally Grounded Spoken Visual QA |
Proposes the EverydayMMQA framework and the OASIS dataset to address the lack of cultural commonsense in multilingual, multimodal VQA |
multimodal |
|
|
| 8 |
CreditDecoding: Accelerating Parallel Decoding in Diffusion Large Language Models with Trace Credits |
Proposes CreditDecoding to accelerate parallel decoding in diffusion large language models |
large language model |
|
|
| 9 |
Distributional Semantics Tracing: A Framework for Explaining Hallucinations in Large Language Models |
Proposes the Distributional Semantics Tracing framework to explain hallucinations in large language models |
large language model |
|
|
| 10 |
Centering Emotion Hotspots: Multimodal Local-Global Fusion and Cross-Modal Alignment for Emotion Recognition in Conversations |
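Proposes a model for emotion recognition in conversations based on emotion hotspots, with multimodal local-global fusion and cross-modal alignment |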
multimodal |
|
|
| 11 |
Code-Switching In-Context Learning for Cross-Lingual Transfer of Large Language Models |
Proposes code-switching in-context learning (CSICL) to improve the cross-lingual transfer ability of large language models |
large language model |
|
|
| 12 |
GraphGhost: Tracing Structures Behind Large Language Models |
GraphGhost: traces the internal reasoning mechanisms of large language models through graph structures |
large language model |
|
|
| 13 |
MADIAVE: Multi-Agent Debate for Implicit Attribute Value Extraction |
Proposes MADIAVE, which uses a multi-agent debate framework to tackle implicit attribute value extraction in e-commerce |
large language model multimodal |
|
|
| 14 |
Toward a Safer Web: Multilingual Multi-Agent LLMs for Mitigating Adversarial Misinformation Attacks |
Proposes a multilingual multi-agent LLM framework for mitigating adversarial misinformation attacks |
large language model |
|
|
| 15 |
Mnemosyne: An Unsupervised, Human-Inspired Long-Term Memory Architecture for Edge-Based LLMs |
Mnemosyne: an unsupervised, human-inspired long-term memory architecture for edge-based LLMs |
large language model |
|
|
| 16 |
Activation-Informed Pareto-Guided Low-Rank Compression for Efficient LLM/VLM |
Proposes PGSVD, which efficiently compresses LLMs/VLMs through activation-informed, Pareto-guided low-rank compression |
large language model |
|
|
| 17 |
RECODE-H: A Benchmark for Research Code Development with Interactive Human Feedback |
RECODE-H: a benchmark for improving research code generation through interactive human feedback |
large language model |
|
|
| 18 |
Spectrum Tuning: Post-Training for Distributional Coverage and In-Context Steerability |
Proposes Spectrum Tuning to improve language models' distributional coverage under in-context steering |
instruction following |
|
|
| 19 |
LexiCon: a Benchmark for Planning under Temporal Constraints in Natural Language |
LexiCon: a benchmark for evaluating LLM planning under temporal constraints expressed in natural language |
large language model |
|
|
| 20 |
Instructional Goal-Aligned Question Generation for Student Evaluation in Virtual Lab Settings: How Closely Do LLMs Actually Align? |
Proposes an instructional goal-aligned question generation framework that uses LLMs to support student evaluation in virtual labs |
large language model |
|
|
| 21 |
Semantic Regexes: Auto-Interpreting LLM Features with a Structured Language |
Proposes semantic regexes to address the automatic interpretation of LLM features |
large language model |
|
|
| 22 |
EVALUESTEER: Measuring Reward Model Steerability Towards Values and Preferences |
EVALUESTEER: measures the steerability of reward models with respect to values and preferences |
large language model |
|
|
| 23 |
VecInfer: Efficient LLM Inference with Low-Bit KV Cache via Outlier-Suppressed Vector Quantization |
VecInfer: efficient LLM inference with a low-bit KV cache via outlier-suppressed vector quantization |
large language model |
|
|
| 24 |
Evaluating The Impact of Stimulus Quality in Investigations of LLM Language Performance |
Improves the evaluation of LLM performance on syntactic prediction tasks by optimizing stimulus quality |
large language model |
|
|
| 25 |
MASA: Rethinking the Representational Bottleneck in LoRA with Multi-A Shared Adaptation |
MASA: rethinks the representational bottleneck in LoRA through multi-A shared adaptation |
large language model |
|
|
| 26 |
Exploring Gaps in the APS: Direct Minimal Pair Analysis in LLM Syntactic Assessments |
Reveals gaps in LLM syntactic assessments through direct minimal pair analysis, improving evaluation transparency |
large language model |
|
|
| 27 |
Hire Your Anthropologist! Rethinking Culture Benchmarks Through an Anthropological Lens |
Rethinks culture benchmarks through an anthropological lens to improve the cultural understanding of large language models |
large language model |
|
|
| 28 |
The fragility of "cultural tendencies" in LLMs |
Reveals the fragility of "cultural tendencies" in large language models through broader experimental validation |
large language model |
|
|
| 29 |
Luth: Efficient French Specialization for Small Language Models and Cross-Lingual Transfer |
Luth: efficient French specialization of small language models and cross-lingual transfer |
large language model |
|
|
| 30 |
Adaptive and Multi-Source Entity Matching for Name Standardization of Astronomical Observation Facilities |
Proposes an adaptive multi-source entity matching method for standardizing the names of astronomical observation facilities |
large language model |
|
|
| 31 |
KEO: Knowledge Extraction on OMIn via Knowledge Graphs and RAG for Safety-Critical Aviation Maintenance |
KEO: knowledge extraction on OMIn using knowledge graphs and RAG for aviation maintenance |
large language model |
|
|