Title

Enhanced Bi-directional Motion Estimation for Video Frame Interpolation

Authors

Xin Jin, Longhai Wu, Guotao Shen, Youxin Chen, Jie Chen, Jayoon Koo, Cheul-hee Hahm

Abstract

We present a novel, simple yet effective algorithm for motion-based video frame interpolation. Existing motion-based interpolation methods typically rely on a pre-trained optical flow model or a U-Net based pyramid network for motion estimation, which either suffers from a large model size or has limited capacity in handling complex and large motion cases. In this work, by carefully integrating intermediate-oriented forward-warping, a lightweight feature encoder, and a correlation volume into a pyramid recurrent framework, we derive a compact model that simultaneously estimates the bi-directional motion between input frames. It is 15 times smaller in size than PWC-Net, yet enables more reliable and flexible handling of challenging motion cases. Based on the estimated bi-directional motion, we forward-warp the input frames and their context features to the intermediate frame, and employ a synthesis network to estimate the intermediate frame from the warped representations. Our method achieves excellent performance on a broad range of video frame interpolation benchmarks. Code and trained models are available at \url{https://github.com/srcn-ivl/EBME}.
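The abstract's core warping step can be illustrated with a minimal sketch: given a flow from frame 0 to frame 1, each pixel is splatted along the time-scaled flow to land at the intermediate time t. This is a simplified nearest-neighbor splatting for intuition only; the function name and array layout are illustrative assumptions, and the paper's actual pipeline uses a learned, differentiable forward-warping with a synthesis network.

```python
import numpy as np

def forward_warp(frame, flow, t=0.5):
    """Splat each pixel of `frame` along t * flow (nearest-neighbor).

    frame: (H, W) array; flow: (H, W, 2) array storing (dx, dy) per pixel.
    Overlaps simply overwrite; real implementations resolve them
    (e.g. with softmax splatting).
    """
    h, w = frame.shape[:2]
    warped = np.zeros_like(frame)
    for y in range(h):
        for x in range(w):
            dx, dy = t * flow[y, x]
            tx, ty = int(round(x + dx)), int(round(y + dy))
            if 0 <= tx < w and 0 <= ty < h:
                warped[ty, tx] = frame[y, x]
    return warped

# Toy example: uniform rightward motion of 4 px; warping to t=0.5
# moves content 2 px to the right.
frame = np.zeros((4, 8))
frame[:, 0] = 1.0
flow = np.zeros((4, 8, 2))
flow[..., 0] = 4.0
mid = forward_warp(frame, flow, t=0.5)  # mass lands in column 2
```

Forward-warping (as opposed to the more common backward-warping) lets both input frames be warped directly toward the intermediate time without first estimating intermediate-time flow.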
