Title
BIMCV COVID-19+: a large annotated dataset of RX and CT images from COVID-19 patients
Authors
Abstract
This paper describes BIMCV COVID-19+, a large dataset from the Valencian Region Medical ImageBank (BIMCV) containing chest X-ray (CXR) images (CR, DX) and computed tomography (CT) imaging of COVID-19+ patients, along with their radiological findings and locations, pathologies, radiological reports (in Spanish), DICOM metadata, polymerase chain reaction (PCR) tests, and immunoglobulin G (IgG) and immunoglobulin M (IgM) diagnostic antibody tests. The findings have been mapped onto standard Unified Medical Language System (UMLS) terminology and cover a wide spectrum of thoracic entities, in contrast to the considerably smaller number of entities annotated in previous datasets. Images are stored in high resolution, and entities are localized with anatomical labels and stored in the Medical Imaging Data Structure (MIDS) format. In addition, 10 images were annotated by a team of radiologists to include semantic segmentation of radiological findings. This first iteration of the database includes 1,380 CX, 885 DX and 163 CT studies from 1,311 COVID-19+ patients. To the best of our knowledge, this is the largest COVID-19+ image dataset available in an open format. The dataset can be downloaded from http://bimcv.cipf.es/bimcv-projects/bimcv-covid19.