| 1 |
Jailbreaking Large Language Models Against Moderation Guardrails via Cipher Characters |
Proposes JAMBench and the JAM method, strengthening large language models' defenses against malicious prompts |
large language model |
|
|
| 2 |
How Multilingual Are Large Language Models Fine-Tuned for Translation? |
Studies how translation fine-tuning affects the multilingual translation capabilities of large language models |
large language model |
|
|
| 3 |
Confidence-Aware Sub-Structure Beam Search (CABS): Mitigating Hallucination in Structured Data Generation with Large Language Models |
Proposes Confidence-Aware Sub-structure Beam Search (CABS) to mitigate hallucination in structured data generation with LLMs |
large language model |
|
|
| 4 |
JoPA: Explaining Large Language Model's Generation via Joint Prompt Attribution |
Proposes the JoPA framework, which explains what drives a large language model's generations via joint prompt attribution |
large language model |
|
|
| 5 |
GNN-RAG: Graph Neural Retrieval for Large Language Model Reasoning |
Proposes the GNN-RAG framework, combining graph neural networks with large language models for knowledge graph question answering |
large language model |
|
|
| 6 |
Is In-Context Learning Sufficient for Instruction Following in LLMs? |
Shows that in-context learning still lags behind instruction fine-tuning for instruction following, and reveals the key role of decoding parameters |
instruction following |
✅ |
|
| 7 |
SeamlessExpressiveLM: Speech Language Model for Expressive Speech-to-Speech Translation with Chain-of-Thought |
Proposes SeamlessExpressiveLM, which uses chain-of-thought prompting for expressive end-to-end speech-to-speech translation |
chain-of-thought |
|
|
| 8 |
ANAH: Analytical Annotation of Hallucinations in Large Language Models |
ANAH: a bilingual dataset for analytical annotation of hallucinations in large language models |
large language model |
|
|
| 9 |
Quest: Query-centric Data Synthesis Approach for Long-context Scaling of Large Language Model |
Quest: a query-centric data synthesis approach for scaling the long-context capabilities of large language models |
large language model |
|
|
| 10 |
Significance of Chain of Thought in Gender Bias Mitigation for English-Dravidian Machine Translation |
Uses Chain of Thought to mitigate gender bias in English-Dravidian machine translation |
chain-of-thought |
|
|
| 11 |
Detecting Hallucinations in Large Language Model Generation: A Token Probability Approach |
Proposes a lightweight token-probability-based method for detecting hallucinations in large language model generations |
large language model |
✅ |
|
| 12 |
TAIA: Large Language Models are Out-of-Distribution Data Learners |
TAIA: treats large language models as out-of-distribution data learners, improving downstream task performance |
large language model |
✅ |
|
| 13 |
One Token Can Help! Learning Scalable and Pluggable Virtual Tokens for Retrieval-Augmented Large Language Models |
Proposes scalable and pluggable virtual tokens for retrieval-augmented large language models, improving performance while preserving generality |
large language model |
|
|
| 14 |
PATIENT-Ψ: Using Large Language Models to Simulate Patients for Training Mental Health Professionals |
PATIENT-Ψ: uses large language models to simulate patients for cognitive behavioral therapy training of mental health professionals |
large language model |
✅ |
|
| 15 |
Retrieval Augmented Structured Generation: Business Document Information Extraction As Tool Use |
Proposes RASG, a retrieval-augmented structured generation framework, to tackle business document information extraction |
large language model multimodal |
|
|
| 16 |
X-Instruction: Aligning Language Model in Low-resource Languages with Self-curated Cross-lingual Instructions |
Proposes X-Instruction, aligning large models in low-resource languages with self-curated cross-lingual instructions |
large language model instruction following |
|
|
| 17 |
Jina CLIP: Your CLIP Model Is Also Your Text Retriever |
Proposes a multi-task contrastive learning approach that makes a CLIP model excel at both image-text and text-only retrieval |
multimodal |
|
|
| 18 |
SPOT: Text Source Prediction from Originality Score Thresholding |
Proposes SPOT to address the text source prediction problem |
large language model |
|
|
| 19 |
Transfer Q Star: Principled Decoding for LLM Alignment |
Proposes Transfer Q Star, principled decoding for large language model alignment via transfer learning |
foundation model |
|
|
| 20 |
Student Answer Forecasting: Transformer-Driven Answer Choice Prediction for Language Learning |
Proposes MCQStudentBert for predicting students' answer choices in language learning, supporting personalized instruction |
large language model |
|
|
| 21 |
From Symbolic Tasks to Code Generation: Diversification Yields Better Task Performers |
Improves code generation via instruction diversification: from symbolic tasks to code generation |
large language model |
|
|
| 22 |
SpecDec++: Boosting Speculative Decoding via Adaptive Candidate Lengths |
SpecDec++ improves speculative decoding efficiency via adaptive candidate lengths, accelerating large language model inference |
large language model |
✅ |
|
| 23 |
Reasoning about concepts with LLMs: Inconsistencies abound |
Reveals inconsistencies in LLMs' concept understanding and proposes knowledge-graph-based prompting strategies to improve model performance |
large language model |
|
|
| 24 |
SLM as Guardian: Pioneering AI Safety with Small Language Models |
Uses small language models as safety guardians, pioneering a new approach to AI safety |
large language model |
|
|
| 25 |
CharacterGPT: A Persona Reconstruction Framework for Role-Playing Agents |
Proposes the CharacterGPT framework to address persona consistency in role-playing agents |
large language model |
✅ |
|
| 26 |
PertEval: Unveiling Real Knowledge Capacity of LLMs with Knowledge-Invariant Perturbations |
PertEval: unveils the real knowledge capacity of large language models via knowledge-invariant perturbations |
large language model |
✅ |
|
| 27 |
Automated Focused Feedback Generation for Scientific Writing Assistance |
Proposes SWIF$^{2}$T for automatically generating focused feedback on scientific writing, helping novice researchers improve their papers |
large language model |
|
|
| 28 |
Cutting Through the Noise: Boosting LLM Performance on Math Word Problems |
Proposes the PROBLEMATHIC dataset and fine-tunes LLMs to improve their robustness on noisy math word problems |
large language model |
|
|
| 29 |
Who Writes the Review, Human or AI? |
Proposes a transfer-learning-based method for distinguishing human-written from AI-generated book reviews |
large language model |
|
|
| 30 |
Towards Hierarchical Multi-Agent Workflows for Zero-Shot Prompt Optimization |
Proposes HMAW, a hierarchical multi-agent workflow for zero-shot prompt optimization, improving LLM performance in open-ended settings |
large language model |
|
|
| 31 |
Divide-and-Conquer Meets Consensus: Unleashing the Power of Functions in Code Generation |
Proposes FunCoder to tackle complex code generation problems |
large language model |
|
|
| 32 |
DevEval: A Manually-Annotated Code Generation Benchmark Aligned with Real-World Code Repositories |
DevEval: a manually annotated code generation benchmark aligned with real-world code repositories |
large language model |
|
|