医疗图像通过生成预估计的变压器字幕

论文标题

医疗图像通过生成预估计的变压器字幕

Medical Image Captioning via Generative Pretrained Transformers

论文作者

Selivanov, Alexander, Rogov, Oleg Y., Chesakov, Daniil, Shelmanov, Artem, Fedulova, Irina, Dylov, Dmitry V.

论文摘要

自动临床标题生成问题被称为提议的模型，将额叶X射线扫描与放射学记录中的结构化患者信息结合在一起。我们将两种语言模型结合在一起，即表演台和GPT-3，以生成全面和描述性的放射学记录。这些模型的建议组合产生了文本摘要，其中包含有关发现的病理，其位置以及将每个病理定位在原始X射线扫描中的每个病理的2D热图。提出的模型在两个医学数据集（Open-I，Mimic-CXR和通用MS-Coco）上进行了测试。用自然语言评估指标测量的结果证明了它们对胸部X射线图像字幕的有效适用性。

The automatic clinical caption generation problem is referred to as proposed model combining the analysis of frontal chest X-Ray scans with structured patient information from the radiology records. We combine two language models, the Show-Attend-Tell and the GPT-3, to generate comprehensive and descriptive radiology records. The proposed combination of these models generates a textual summary with the essential information about pathologies found, their location, and the 2D heatmaps localizing each pathology on the original X-Ray scans. The proposed model is tested on two medical datasets, the Open-I, MIMIC-CXR, and the general-purpose MS-COCO. The results measured with the natural language assessment metrics prove their efficient applicability to the chest X-Ray image captioning.

下载PDF全文

下载文献需遵守相关版权规定

论文标题