Paper title
SiEVE: Semantically Encoded Video Analytics on Edge and Cloud
Paper authors
Abstract
Recent advances in computer vision and neural networks have made it possible for more surveillance videos to be automatically searched and analyzed by algorithms rather than humans. This has happened in parallel with advances in edge computing, where videos are analyzed over hierarchical clusters that contain edge devices close to the video source. However, the current video analysis pipeline has several disadvantages when dealing with such advances. For example, video encoders have long been designed to please human viewers and are agnostic of the downstream analysis task (e.g., object detection). Moreover, most video analytics systems use a 2-tier architecture in which the encoded video is sent to either a remote cloud or a private edge server, and they do not efficiently leverage both. In response, we present SiEVE, a 3-tier video analytics system that reduces the latency and increases the throughput of analytics over video streams. In SiEVE, we present a novel technique to detect objects in compressed video streams. We refer to this technique as semantic video encoding because it allows video encoders to be aware of the semantics of the downstream task (e.g., object detection). Our results show that by leveraging semantic video encoding, we achieve close to 100% object detection accuracy while decompressing only 3.5% of the video frames, which results in more than 100x speedup compared to classical approaches that decompress every video frame.
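The idea of analyzing a stream while decompressing only a small fraction of its frames can be illustrated with a minimal sketch. This is not the SiEVE implementation: the abstract does not say how frames are selected, so the sketch assumes a hypothetical `is_keyframe` flag set by the encoder, and `EncodedFrame`, `analyze_stream`, `decode`, and `detect` are illustrative names only. Frames without the flag are never decoded and simply inherit the labels of the most recent decoded frame.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass
class EncodedFrame:
    index: int
    is_keyframe: bool  # hypothetical flag: the encoder marked this frame as worth decoding
    payload: bytes     # compressed frame data


def analyze_stream(
    frames: Sequence[EncodedFrame],
    decode: Callable[[bytes], object],
    detect: Callable[[object], List[str]],
) -> Tuple[List[List[str]], int]:
    """Label every frame while decoding only the flagged ones.

    Returns the per-frame labels and the number of frames actually decoded.
    """
    labels: List[List[str]] = []
    current: List[str] = []  # labels from the most recent decoded frame
    decoded = 0
    for frame in frames:
        if frame.is_keyframe:
            current = detect(decode(frame.payload))
            decoded += 1
        # Unflagged frames are skipped and reuse the last detection result.
        labels.append(list(current))
    return labels, decoded


# Toy stream: 100 frames, one flagged frame every 50 frames.
frames = [
    EncodedFrame(i, i % 50 == 0, b"car" if i < 50 else b"person")
    for i in range(100)
]
# Stand-ins for a real decoder and object detector (assumptions for the demo).
labels, decoded = analyze_stream(
    frames,
    decode=lambda payload: payload.decode(),
    detect=lambda image: [image],
)
print(f"decoded {decoded} of {len(frames)} frames")
```

In this toy setup the cost of decoding and detection scales with the number of flagged frames rather than with the stream length, which is the property the abstract relies on. Whether labels can safely be carried forward depends on how well the flagged frames track scene changes, which the sketch simply assumes.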