A robust and lightweight deep attention multiple instance learning algorithm for predicting genetic alterations

Paper title

A robust and lightweight deep attention multiple instance learning algorithm for predicting genetic alterations

Authors

Guo, Bangwei; Li, Xingyu; Yang, Miaomiao; Zhang, Hong; Xu, Xu Steven

Abstract

Deep-learning models based on whole-slide digital pathology images (WSIs) have become increasingly popular for predicting molecular biomarkers. Instance-based models have been the mainstream strategy for predicting genetic alterations from WSIs, although bag-based models and self-attention-based algorithms have been proposed for other digital pathology applications. In this paper, we propose a novel Attention-based Multiple Instance Mutation Learning (AMIML) model for predicting gene mutations. AMIML comprises successive 1-D convolutional layers, a decoder, and a residual weight connection, which together allow a lightweight attention mechanism to be integrated to detect the most predictive image patches. Using data for 24 clinically relevant genes from four cancer cohorts in The Cancer Genome Atlas (TCGA) studies (UCEC, BRCA, GBM, and KIRC), we compared AMIML with one popular instance-based model and four recently published bag-based models (e.g., CHOWDER and HE2RNA). AMIML demonstrated excellent robustness: it outperformed all five baseline algorithms for the vast majority of the tested genes (17 out of 24) and delivered near-best performance for the other seven. By contrast, the performance of the published baseline algorithms varied across cancers and genes. In addition, compared with published models for genetic alterations, AMIML provided a significant improvement in predicting a wide range of genes (e.g., KMT2C, TP53, and SETD2 for KIRC; ERBB2, BRCA1, and BRCA2 for BRCA; JAK1, POLE, and MTOR for UCEC) and produced outstanding predictive models for other clinically relevant gene mutations that have not been reported in the current literature. Furthermore, with its flexible and interpretable attention-based MIL pooling mechanism, AMIML can zero in on and detect predictive image patches.
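
To illustrate what attention-based MIL pooling means in general, here is a minimal NumPy sketch. It is not the AMIML architecture: the 1-D convolutional layers, decoder, and residual weight connection named in the abstract are omitted. The function name `attention_mil_pool`, the parameters `V` and `w`, and the random data are hypothetical and stand in for learned weights and real patch features.

```python
import numpy as np

def attention_mil_pool(patch_features, V, w):
    """Pool a bag of patch features into one slide-level vector.

    patch_features: (n_patches, d) array, one row per image patch.
    V: (d, h) projection matrix, w: (h,) scoring vector.
       Both are hypothetical stand-ins for learned parameters.
    Returns (slide_embedding, attention_weights).
    """
    # One scalar attention score per patch.
    scores = np.tanh(patch_features @ V) @ w
    # Softmax over patches (shifted by the max for numerical stability).
    scores = scores - scores.max()
    weights = np.exp(scores) / np.exp(scores).sum()
    # Slide-level embedding is the attention-weighted mean of the patches.
    slide_embedding = weights @ patch_features
    return slide_embedding, weights

# Toy bag: 50 patches with 16 features each (random data, illustration only).
rng = np.random.default_rng(0)
bag = rng.normal(size=(50, 16))
V = rng.normal(size=(16, 8))
w = rng.normal(size=8)

embedding, attn = attention_mil_pool(bag, V, w)
# Patches with the largest attention weights are the candidates for
# "most predictive" regions in this kind of pooling.
top_patches = np.argsort(attn)[::-1][:5]
```

Because the weights are non-negative and sum to one across the bag, they can be read as a per-patch importance ranking, which is what makes this family of pooling mechanisms interpretable.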
