Neural Matching Fields: Implicit Representation of Matching Fields for Visual Correspondence

Paper title

Neural Matching Fields: Implicit Representation of Matching Fields for Visual Correspondence

Authors

Sunghwan Hong, Jisu Nam, Seokju Cho, Susung Hong, Sangryul Jeon, Dongbo Min, Seungryong Kim

Abstract

Existing pipelines for semantic correspondence commonly extract high-level semantic features for invariance against intra-class variations and background clutter. This architecture, however, inevitably results in a low-resolution matching field that additionally requires an ad-hoc interpolation step as post-processing to convert it into a high-resolution one, which limits the overall performance of the matching results. To overcome this, inspired by the recent success of implicit neural representations, we present a novel method for semantic correspondence, called Neural Matching Field (NeMF). However, the complexity and high dimensionality of a 4D matching field are major hindrances. To address them, we propose a cost embedding network that processes a coarse cost volume, which then serves as guidance for establishing a high-precision matching field through a subsequent fully-connected network. Nevertheless, learning a high-dimensional matching field remains challenging, mainly due to computational complexity, since a naive exhaustive inference would require querying all pixels in the 4D space to infer pixel-wise correspondences. To overcome this, we propose suitable training and inference procedures: in the training phase, we randomly sample matching candidates, and in the inference phase, we iteratively perform PatchMatch-based inference and coordinate optimization at test time. With these combined, competitive results are attained on several standard benchmarks for semantic correspondence. Code and pre-trained weights are available at https://ku-cvlab.github.io/NeMF/.
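
To make the inference cost argument concrete, the following is a minimal sketch of a generic PatchMatch-style search over a continuous matching field. It is not the authors' implementation: `toy_score` is a hypothetical closed-form stand-in for a learned matching field, and `patchmatch_search`, the fixed shift, and all parameters are assumptions chosen for illustration. An exhaustive search on an h x w image pair would score h*w*h*w coordinate pairs, whereas this sketch scores only a handful of candidates per source pixel in each iteration.

```python
import numpy as np


def toy_score(src_xy, tgt_xy, shift=(3.0, -2.0)):
    """Hypothetical stand-in for a learned matching field (higher is better).

    The score peaks where tgt == src + shift. A real model would be a neural
    network guided by a cost volume; this closed form is only for illustration.
    """
    diff = tgt_xy - (src_xy + np.asarray(shift))
    return -np.sum(diff ** 2, axis=-1)


def patchmatch_search(h, w, score_fn, iters=4, seed=0):
    """Generic PatchMatch-style search: random init, propagation, random search."""
    rng = np.random.default_rng(seed)
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    src = np.stack([xs, ys], axis=-1).astype(float)  # (h, w, 2) source coords

    # Random initialization: one continuous target coordinate per source pixel.
    match = rng.uniform([0, 0], [w - 1, h - 1], size=(h, w, 2))
    best = score_fn(src, match)
    history = [float(best.mean())]

    for _ in range(iters):
        # Propagation: try the offset currently held by each of four neighbors.
        for dy, dx in ((0, 1), (1, 0), (0, -1), (-1, 0)):
            offset = match - src
            cand = src + np.roll(offset, shift=(dy, dx), axis=(0, 1))
            s = score_fn(src, cand)
            better = s > best
            match[better] = cand[better]
            best[better] = s[better]

        # Random search: perturb the current match within a shrinking radius.
        # Candidates are not clipped to the image bounds, for brevity.
        radius = max(h, w) / 2.0
        while radius >= 0.5:
            cand = match + rng.uniform(-radius, radius, size=match.shape)
            s = score_fn(src, cand)
            better = s > best
            match[better] = cand[better]
            best[better] = s[better]
            radius /= 2.0

        history.append(float(best.mean()))

    return match, history


if __name__ == "__main__":
    match, history = patchmatch_search(16, 16, toy_score)
    print("mean score per iteration:", history)
```

Because a candidate is accepted only when it scores strictly higher than the current match, the mean score in `history` never decreases from one iteration to the next. The coordinate optimization mentioned in the abstract is not covered by this sketch.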
