Paper Title

Visual Recipe Flow: A Dataset for Learning Visual State Changes of Objects with Recipe Flows

Authors

Keisuke Shirai, Atsushi Hashimoto, Taichi Nishimura, Hirotaka Kameko, Shuhei Kurita, Yoshitaka Ushiku, Shinsuke Mori

Abstract

We present a new multimodal dataset called Visual Recipe Flow, which enables us to learn the result of each cooking action in a recipe text. The dataset consists of object state changes and the workflow of the recipe text. Each state change is represented as an image pair, while the workflow is represented as a recipe flow graph (r-FG). The image pairs are grounded in the r-FG, which provides the cross-modal relation. With our dataset, one can try a range of applications, from multimodal commonsense reasoning to procedural text generation.
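To make the structure described in the abstract concrete, the following is a minimal sketch of how an image pair might be grounded in a recipe flow graph. It is not the dataset's actual file format or API: the class names (`Node`, `ImagePair`, `RecipeFlowGraph`), the tag values (`"food"`, `"action"`), the edge label (`"target"`), and the image file names are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Node:
    """A tagged expression in the recipe text (hypothetical schema)."""
    node_id: int
    text: str
    tag: str  # assumed tag names: "food" or "action"


@dataclass(frozen=True)
class ImagePair:
    """A visual state change: the object before and after one action."""
    before_path: str
    after_path: str


@dataclass
class RecipeFlowGraph:
    """Workflow graph with image pairs attached to action nodes."""
    nodes: Dict[int, Node] = field(default_factory=dict)
    # Each edge is (source id, destination id, label).
    edges: List[Tuple[int, int, str]] = field(default_factory=list)
    # Cross-modal relation: action node id -> image pair.
    grounding: Dict[int, ImagePair] = field(default_factory=dict)

    def add_node(self, node_id: int, text: str, tag: str) -> Node:
        node = Node(node_id, text, tag)
        self.nodes[node_id] = node
        return node

    def add_edge(self, src: int, dst: int, label: str) -> None:
        if src not in self.nodes or dst not in self.nodes:
            raise KeyError("both endpoints must be added before the edge")
        self.edges.append((src, dst, label))

    def ground(self, action_id: int, pair: ImagePair) -> None:
        # In this sketch, only action nodes carry a state change.
        if self.nodes[action_id].tag != "action":
            raise ValueError("only action nodes carry a state change")
        self.grounding[action_id] = pair

    def state_change(self, action_id: int) -> Optional[ImagePair]:
        """Return the grounded image pair, or None if there is none."""
        return self.grounding.get(action_id)


def build_example() -> RecipeFlowGraph:
    """Build a toy graph for the instruction 'slice the tomato'."""
    graph = RecipeFlowGraph()
    graph.add_node(0, "tomato", "food")
    graph.add_node(1, "slice", "action")
    graph.add_edge(0, 1, "target")  # hypothetical edge label
    graph.ground(1, ImagePair("step1_before.jpg", "step1_after.jpg"))
    return graph
```

Calling `build_example().state_change(1)` returns the image pair attached to the "slice" action, while `state_change(0)` returns `None` because the ingredient node has no grounded state change in this sketch.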
