Paper Title

Situated and Interactive Multimodal Conversations

Authors

Moon, Seungwhan; Kottur, Satwik; Crook, Paul A.; De, Ankita; Poddar, Shivani; Levin, Theodore; Whitney, David; Difranco, Daniel; Beirami, Ahmad; Cho, Eunjoon; Subba, Rajen; Geramifard, Alborz

Abstract

Next-generation virtual assistants are envisioned to handle multimodal inputs (e.g., vision and memories of previous interactions, in addition to the user's utterances) and to perform multimodal actions (e.g., displaying a route in addition to generating the system's utterance). We introduce Situated Interactive MultiModal Conversations (SIMMC) as a new direction aimed at training agents that take multimodal actions grounded in a co-evolving multimodal input context in addition to the dialog history. We provide two SIMMC datasets totalling ~13K human-human dialogs (~169K utterances), collected using a multimodal Wizard-of-Oz (WoZ) setup, on two shopping domains: (a) furniture (grounded in a shared virtual environment) and (b) fashion (grounded in an evolving set of images). We also provide logs of the items appearing in each scene, and contextual NLU and coreference annotations, using a novel and unified framework of SIMMC conversational acts for both user and assistant utterances. Finally, we present several tasks within SIMMC as objective evaluation protocols, such as Structural API Prediction and Response Generation. We benchmark a collection of existing models on these SIMMC tasks as strong baselines, and demonstrate rich multimodal conversational interactions. Our data, annotations, code, and models are publicly available.
