Novel Object Viewpoint Estimation through Reconstruction Alignment

Paper Title

Novel Object Viewpoint Estimation through Reconstruction Alignment

Authors

Mohamed El Banani, Jason J. Corso, David F. Fouhey

Abstract

The goal of this paper is to estimate the viewpoint of a novel object. Standard viewpoint estimation approaches generally fail on this task because they rely either on a 3D model for alignment or on large amounts of class-specific training data with corresponding canonical poses. We overcome these limitations by learning a reconstruct-and-align approach. Our key insight is that although we do not have an explicit 3D model or a predefined canonical pose, we can still learn to estimate the object's shape in the viewer's frame and then use an image to provide our reference model or canonical pose. In particular, we propose learning two networks: the first maps images to a 3D geometry-aware feature bottleneck and is trained via an image-to-image translation loss; the second learns whether two instances of features are aligned. At test time, our model finds the relative transformation that best aligns the bottleneck features of our test image to those of a reference image. We evaluate our method on novel object viewpoint estimation by generalizing across different datasets, analyzing the impact of our different modules, and providing a qualitative analysis of the learned features to identify what representations are being learned for alignment.
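
The test-time step described in the abstract (searching for the relative transformation that best aligns the test features to the reference features) can be illustrated with a toy sketch. This is not the authors' implementation. It assumes the bottleneck is a cubic feature grid of shape (channels, X, Y, Z), restricts the candidate transformations to 90-degree rotations about one grid axis, and replaces the learned alignment network with a simple negative mean squared error. The names `alignment_score` and `estimate_relative_rotation` are hypothetical.

```python
import numpy as np


def alignment_score(feat_a, feat_b):
    """Stand-in for the learned alignment network (assumption): higher is better."""
    return -float(np.mean((feat_a - feat_b) ** 2))


def estimate_relative_rotation(test_feat, ref_feat):
    """Score every candidate rotation of the test features against the reference.

    Candidates are k * 90 degree rotations in the plane of grid axes 1 and 2,
    a deliberately coarse search space chosen only for illustration.
    Returns the best angle in degrees and its score.
    """
    best_k, best_score = 0, -np.inf
    for k in range(4):
        rotated = np.rot90(test_feat, k=k, axes=(1, 2))
        score = alignment_score(rotated, ref_feat)
        if score > best_score:
            best_k, best_score = k, score
    return best_k * 90, best_score


# Synthetic data: random "reference" features, and "test" features that are
# the same grid rotated by -90 degrees, so a +90 degree rotation realigns them.
rng = np.random.default_rng(0)
ref_feat = rng.normal(size=(8, 16, 16, 16))
test_feat = np.rot90(ref_feat, k=-1, axes=(1, 2))

angle, score = estimate_relative_rotation(test_feat, ref_feat)
print(f"estimated relative rotation: {angle} degrees (score {score:.3f})")
```

In the method the abstract describes, the features would come from the first network and the score from the second; the exhaustive loop over a handful of rotations here only shows the shape of the search.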
