Lossless Compression of Point Cloud Sequences Using Sequence Optimized CNN Models

Paper Title

Lossless Compression of Point Cloud Sequences Using Sequence Optimized CNN Models

Authors

Emre Can Kaya, Ioan Tabus

Abstract

We propose a new paradigm for encoding the geometry of point cloud sequences, where the convolutional neural network (CNN) which estimates the encoding distributions is optimized on several frames of the sequence to be compressed. We adopt lightweight CNN structures, we perform training as part of the encoding process, and the CNN parameters are transmitted as part of the bitstream. The newly proposed encoding scheme operates on the octree representation for each point cloud, encoding consecutively each octree resolution layer. At every octree resolution layer, the voxel grid is traversed section-by-section (each section being perpendicular to a selected coordinate axis) and in each section the occupancies of groups of two-by-two voxels are encoded at once, in a single arithmetic coding operation. A context for the conditional encoding distribution is defined for each two-by-two group of voxels, based on the information available about the occupancy of neighbor voxels in the current and lower resolution layers of the octree. The CNN estimates the probability distributions of occupancy patterns of all voxel groups from one section in four phases. In each new phase the contexts are updated with the occupancies encoded in the previous phase, and each phase estimates the probabilities in parallel, providing a reasonable trade-off between the parallelism of processing and the informativeness of the contexts. The CNN training time is comparable to the time spent in the remaining encoding steps, leading to competitive overall encoding times. Bitrates and encoding-decoding times compare favorably with those of recently published compression schemes.
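
The traversal described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the choice of the x axis for sectioning, the bit order inside a two-by-two group, and the parity-based assignment of groups to four phases are all assumptions made for illustration, and the CNN and the arithmetic coder are omitted entirely.

```python
def octree_layer(points, depth, level):
    """Occupied voxels at octree `level`, for integer points given at full `depth`."""
    shift = depth - level
    return {(x >> shift, y >> shift, z >> shift) for x, y, z in points}


def section(voxels, x):
    """Occupied (y, z) cells of the section perpendicular to the x axis (assumed axis)."""
    return {(vy, vz) for vx, vy, vz in voxels if vx == x}


def group_symbols(cells, size):
    """Map every 2x2 group (gy, gz) of a size-by-size section to a 4-bit symbol in 0..15."""
    offsets = ((0, 0), (0, 1), (1, 0), (1, 1))  # assumed bit order within a group
    symbols = {}
    for gy in range(size // 2):
        for gz in range(size // 2):
            s = 0
            for bit, (dy, dz) in enumerate(offsets):
                if (2 * gy + dy, 2 * gz + dz) in cells:
                    s |= 1 << bit
            symbols[(gy, gz)] = s
    return symbols


def phases(symbols):
    """Split groups into four phases by index parity (a hypothetical schedule)."""
    out = [[], [], [], []]
    for gy, gz in sorted(symbols):
        out[2 * (gy % 2) + (gz % 2)].append((gy, gz))
    return out


# Toy point cloud on an 8x8x8 grid (depth 3).
pts = [(0, 0, 0), (0, 1, 1), (0, 2, 5), (7, 7, 7)]
cells = section(octree_layer(pts, depth=3, level=3), x=0)
sym = group_symbols(cells, size=8)
for k, groups in enumerate(phases(sym)):
    # All groups within one phase could have their distributions estimated in parallel.
    print(k, [(g, sym[g]) for g in groups])
```

Each group yields one symbol from a 16-letter alphabet, which matches the idea of coding a whole two-by-two occupancy pattern in a single arithmetic coding operation. In a real codec, the groups of a later phase would use contexts that include the symbols already coded in earlier phases.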

Paper Title