cs.CL (2025-03-09)

📊 18 papers in total | 🔗 1 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (15 🔗1) · Pillar 2: RL & Architecture (3)

🔬 Pillar 9: Embodied Foundation Models (15 papers)

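# | Title | One-sentence summary | Tags | 🔗
1 | Multimodal Programming in Computer Science with Interactive Assistance Powered by Large Language Model | Builds an LLM-powered interactive programming assistance system to improve computer science teaching. | large language model, multimodal |
2 | Large Language Models Are Effective Human Annotation Assistants, But Not Good Independent Annotators | LLMs can assist human event annotation but cannot produce high-quality annotations independently. | large language model |
3 | Delusions of Large Language Models | Reveals a new form of LLM hallucination, the high-confidence hallucination (delusion), and strategies to mitigate it. | large language model |
4 | Alignment for Efficient Tool Calling of Large Language Models | Proposes a multi-objective alignment framework that makes LLM tool calling more efficient and reduces unnecessary calls. | large language model |
5 | InftyThink: Breaking the Length Limits of Long-Context Reasoning in Large Language Models | InftyThink: breaks the length limits of long-context reasoning in LLMs, enabling unbounded reasoning depth. | large language model |
6 | Beyond Decoder-only: Large Language Models Can be Good Encoders for Machine Translation | Uses LLMs as encoders to improve the efficiency and generalization of neural machine translation. | large language model |
7 | WildIFEval: Instruction Following in the Wild | WildIFEval: a large-scale dataset of real user instructions for evaluating LLM instruction following under complex constraints. | instruction following |
8 | Effectiveness of Zero-shot-CoT in Japanese Prompts | Compares the effectiveness of zero-shot CoT prompting in Japanese and English. | chain-of-thought |
9 | PFDial: A Structured Dialogue Instruction Fine-tuning Method Based on UML Flowcharts | PFDial: a structured dialogue instruction fine-tuning method based on UML flowcharts that improves process-driven dialogue systems. | large language model |
10 | DependEval: Benchmarking LLMs for Repository Dependency Understanding | DependEval: a hierarchical benchmark for evaluating how well LLMs understand repository dependencies. | large language model |
11 | FEA-Bench: A Benchmark for Evaluating Repository-Level Code Generation for Feature Implementation | FEA-Bench: a benchmark for evaluating code LLMs on repository-level code generation for feature implementation. | large language model |
12 | Enhancing NLP Robustness and Generalization through LLM-Generated Contrast Sets: A Scalable Framework for Systematic Evaluation and Adversarial Training | Uses LLM-generated contrast sets to improve the robustness and generalization of NLP models. | large language model |
13 | Evaluating and Aligning Human Economic Risk Preferences in LLMs | Evaluates and aligns LLMs with human economic risk preferences to make their decisions more rational. | large language model |
14 | BingoGuard: LLM Content Moderation Tools with Risk Levels | BingoGuard: builds an LLM content moderation tool that can assess risk levels. | large language model |
15 | SafeSpeech: A Comprehensive and Interactive Tool for Analysing Sexist and Abusive Language in Conversations | SafeSpeech: a comprehensive, interactive tool for analysing sexist and abusive language in conversations. | large language model |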

🔬 Pillar 2: RL & Architecture (3 papers)

# | Title | One-sentence summary | Tags | 🔗
16 | Dr Genre: Reinforcement Learning from Decoupled LLM Feedback for Generic Text Rewriting | Proposes Dr Genre, a reinforcement learning framework with decoupled LLM feedback for generic text rewriting. | reinforcement learning, large language model, instruction following |
17 | GFlowVLM: Enhancing Multi-step Reasoning in Vision-Language Models with Generative Flow Networks | Proposes GFlowVLM to address limited solution diversity in multi-step reasoning. | reinforcement learning, PPO, chain-of-thought |
18 | Less is More: Adaptive Program Repair with Bug Localization and Preference Learning | Proposes AdaPatcher, which generates minimally modified patches through adaptive program repair. | preference learning |
