Paper Title

Two-Stage Resampling for Convolutional Neural Network Training in the Imbalanced Colorectal Cancer Image Classification

Authors

Koziarski, Michał

Abstract

Data imbalance remains one of the open challenges in contemporary machine learning. It is especially prevalent in medical data, such as histopathological images. Traditional data-level approaches for dealing with data imbalance are ill-suited for image data: oversampling methods such as SMOTE and its derivatives lead to the creation of unrealistic synthetic observations, whereas undersampling reduces the amount of available data, which is critical for the successful training of convolutional neural networks. To alleviate the problems associated with over- and undersampling, we propose a novel two-stage resampling methodology, in which we initially use oversampling techniques in the image space to leverage a large amount of data for training a convolutional neural network, and afterwards apply undersampling in the feature space to fine-tune the last layers of the network. Experiments conducted on a colorectal cancer image dataset indicate the usefulness of the proposed approach.
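The two-stage idea in the abstract can be illustrated with a minimal sketch of the index bookkeeping only. The abstract does not say which oversampling or undersampling algorithms are used, so plain random resampling is swapped in here as an assumption. The function names, the toy label list, and the 90/10 class split are all hypothetical, and the CNN training and fine-tuning steps are left as comments.

```python
import random
from collections import Counter


def group_by_class(labels):
    """Map each class label to the list of sample indices carrying it."""
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    return by_class


def random_oversample(labels, rng):
    """Indices in which every class is repeated up to the majority size."""
    by_class = group_by_class(labels)
    target = max(len(idx) for idx in by_class.values())
    out = []
    for idx in by_class.values():
        out.extend(idx)
        # Draw the missing samples with replacement from the same class.
        out.extend(rng.choices(idx, k=target - len(idx)))
    return out


def random_undersample(labels, rng):
    """Indices in which every class is cut down to the minority size."""
    by_class = group_by_class(labels)
    target = min(len(idx) for idx in by_class.values())
    out = []
    for idx in by_class.values():
        # Draw without replacement so no sample is repeated.
        out.extend(rng.sample(idx, target))
    return out


# Hypothetical toy labels: 90 majority-class and 10 minority-class images.
labels = [0] * 90 + [1] * 10
rng = random.Random(0)

# Stage 1 (image space): oversample, then train the whole CNN on these indices.
stage1 = random_oversample(labels, rng)

# Stage 2 (feature space): extract features from the trained network for the
# original samples, undersample them, and fine-tune only the last layers.
stage2 = random_undersample(labels, rng)

stage1_counts = Counter(labels[i] for i in stage1)
stage2_counts = Counter(labels[i] for i in stage2)
```

In this sketch stage 1 gives the network a larger, balanced stream of images to learn features from, while stage 2 rebalances using only real observations, so the final layers are not fitted to repeated or synthetic minority samples.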
