Paper Title
ROOTS: Object-Centric Representation and Rendering of 3D Scenes
Authors
Abstract
A crucial ability of human intelligence is to build up models of individual 3D objects from partial scene observations. Recent works achieve object-centric generation but without the ability to infer the representation, or achieve 3D scene representation learning but without object-centric compositionality. Therefore, learning to represent and render 3D scenes with object-centric compositionality remains elusive. In this paper, we propose a probabilistic generative model for learning to build modular and compositional 3D object models from partial observations of a multi-object scene. The proposed model can (i) infer the 3D object representations by learning to search and group object areas and also (ii) render from an arbitrary viewpoint not only individual objects but also the full scene by compositing the objects. The entire learning process is unsupervised and end-to-end. In experiments, in addition to generation quality, we also demonstrate that the learned representation permits object-wise manipulation and novel scene generation, and generalizes to various settings. Results can be found on our project website: https://sites.google.com/view/roots3d
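The abstract's point (ii), rendering individual objects and compositing them into the full scene, can be illustrated with a toy sketch. This is not the ROOTS model itself (which is a learned probabilistic generative model); it is only an assumed, simplified analogue in which each object is rendered into its own layer and the layers are merged with a depth test, so the closest object wins each pixel. All names (`Obj`, `render_object`, `composite`) and the one-pixel "renderer" are illustrative assumptions.

```python
# Illustrative sketch (NOT the ROOTS model): object-centric rendering
# by compositing per-object layers with a depth buffer. Each object is
# rendered independently, then the scene is composed object-wise.
from dataclasses import dataclass

@dataclass
class Obj:
    x: int        # horizontal pixel position
    y: int        # vertical pixel position
    depth: float  # distance from the camera
    color: str    # one-character "appearance code"

def render_object(obj, width, height):
    """Render a single object into its own sparse image layer."""
    layer = {}
    if 0 <= obj.x < width and 0 <= obj.y < height:
        layer[(obj.y, obj.x)] = (obj.depth, obj.color)
    return layer

def composite(objects, width=8, height=4):
    """Composite per-object layers: the nearest object wins each pixel."""
    zbuf = [[float("inf")] * width for _ in range(height)]
    img = [["."] * width for _ in range(height)]
    for obj in objects:
        for (r, c), (d, col) in render_object(obj, width, height).items():
            if d < zbuf[r][c]:  # depth test: keep the closer object
                zbuf[r][c] = d
                img[r][c] = col
    return ["".join(row) for row in img]

scene = [Obj(2, 1, depth=5.0, color="A"),
         Obj(2, 1, depth=3.0, color="B"),  # closer, so it occludes A
         Obj(6, 2, depth=4.0, color="C")]
print("\n".join(composite(scene)))
```

Because the scene is a set of independent object representations, object-wise manipulation (the property demonstrated in the paper's experiments) is trivial here: deleting, moving, or swapping an `Obj` and re-running `composite` yields a coherent new scene.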