Title
Topological Data Analysis of copy number alterations in cancer
Authors
Abstract
Identifying subgroups and properties of cancer biopsy samples is a crucial step towards obtaining precise diagnoses and enabling personalized treatment of cancer patients. Recent data collections provide a comprehensive characterization of cancer cells, including genetic data on copy number alterations (CNAs). We explore the potential to capture the information contained in cancer genomic data using a novel topology-based approach that encodes each cancer sample as a persistence diagram of topological features, i.e., high-dimensional voids represented in the data. We find that this technique has the potential to extract meaningful low-dimensional representations of cancer somatic genetic data, and we demonstrate its viability for applications such as finding substructures in cancer data and comparing the similarity of cancer types.
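To make the notion of a persistence diagram concrete, the following is a minimal sketch, not the authors' pipeline. The abstract does not say how a CNA profile is turned into a point cloud or which homology dimensions are used, so everything here is an assumption made for illustration: the toy profile values, the sliding-window embedding, the function name `h0_persistence`, and the restriction to 0-dimensional features (connected components) rather than the higher-dimensional voids the abstract mentions. For a Vietoris-Rips filtration, the 0-dimensional diagram can be read off a union-find pass over the edges sorted by length.

```python
from itertools import combinations
from math import dist


def h0_persistence(points):
    """0-dimensional persistence diagram of a Vietoris-Rips filtration.

    Returns (birth, death) pairs. Every point is born at scale 0; a
    connected component dies at the length of the edge that first merges
    it into another component. The single component that survives all
    merges gets death = inf.
    """
    n = len(points)
    if n == 0:
        return []
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise edges, shortest first (the order in which they enter the filtration).
    edges = sorted(
        (dist(points[i], points[j]), i, j) for i, j in combinations(range(n), 2)
    )
    diagram = []
    for length, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj                  # two components merge ...
            diagram.append((0.0, length))    # ... so one of them dies here
    diagram.append((0.0, float("inf")))      # the component that never dies
    return diagram


# Hypothetical toy CNA profile: copy-number log-ratios along genomic bins.
toy_profile = [0.0, 0.1, 0.0, 1.0, 1.1, 1.0, 0.0, -1.0, -0.9]

# Assumed embedding: consecutive pairs of bins become points in the plane.
cloud = [tuple(toy_profile[k:k + 2]) for k in range(len(toy_profile) - 1)]

diagram = h0_persistence(cloud)
for birth, death in diagram:
    print(f"birth={birth:.2f}  death={death:.2f}")
```

Points with large finite death values correspond to well-separated clusters of windows, which in this toy setting would reflect distinct copy-number states. Capturing loops and higher-dimensional voids requires a full persistent homology computation, for which a dedicated library would normally be used.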