Paper title

Latent regularization for feature selection using kernel methods in tumor classification

Authors

Martin Palazzo, Patricio Yankilevich, Pierre Beauseroy

Abstract

Transcriptomic profiles of cancer tumors are characterized by tens of thousands of gene expression features. Given a gene expression profile, patient prognosis or tumor stage can be assessed with machine learning techniques such as supervised classification. Feature selection is a useful way to identify the key genes that help classify tumors. In this work we propose a feature selection method based on Multiple Kernel Learning that yields a reduced subset of genes and a custom kernel that improves classification performance when used in support vector classification. During feature selection, the method performs a novel latent regularization: it relaxes the supervised target problem by introducing unsupervised structure obtained from the latent space learned by a nonlinear dimensionality reduction model. Compared with other supervised feature selection approaches, a classifier trained on the features selected by the proposed method generalizes better, as measured by tumor classification performance on unseen test samples.
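
The abstract does not give the method's objective, so the following is only a minimal sketch of the general idea and not the authors' algorithm. It assumes one RBF kernel per gene, scores each kernel by centered kernel alignment against a target kernel, and forms that target by blending the label kernel with a kernel built from a kernel PCA latent space. The function names, the blending weight `lam`, the kernel widths, and the synthetic data are all illustrative assumptions.

```python
import numpy as np

def rbf_kernel(X, gamma):
    # Gaussian kernel from pairwise squared Euclidean distances.
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    return np.exp(-gamma * d2)

def center(K):
    # Double-center a kernel matrix (centering in feature space).
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H

def alignment(K1, K2):
    # Centered kernel alignment: normalized Frobenius inner product.
    A, B = center(K1), center(K2)
    denom = np.linalg.norm(A) * np.linalg.norm(B)
    return float(np.sum(A * B) / denom) if denom > 0 else 0.0

def kernel_pca_embedding(X, n_components, gamma):
    # Unsupervised nonlinear embedding: top eigenvectors of the centered kernel.
    vals, vecs = np.linalg.eigh(center(rbf_kernel(X, gamma)))
    idx = np.argsort(vals)[::-1][:n_components]
    return vecs[:, idx] * np.sqrt(np.maximum(vals[idx], 0.0))

def select_features(X, y, k, lam=0.3, gamma=1.0, n_components=2):
    # Supervised target: ideal label kernel y y^T.
    K_y = np.outer(y, y).astype(float)
    # Unsupervised structure: linear kernel on the latent coordinates.
    Z = kernel_pca_embedding(X, n_components, gamma / X.shape[1])
    K_z = Z @ Z.T
    # Assumed form of the relaxation: convex blend of the two unit-norm kernels.
    target = (1.0 - lam) * K_y / np.linalg.norm(K_y) + lam * K_z / np.linalg.norm(K_z)
    # Score each single-feature kernel against the blended target.
    scores = np.array([
        alignment(rbf_kernel(X[:, [j]], gamma), target)
        for j in range(X.shape[1])
    ])
    return np.argsort(scores)[::-1][:k], scores

# Synthetic stand-in for expression data: 60 samples, 20 "genes",
# of which only genes 0 and 1 carry class information.
rng = np.random.default_rng(0)
y = np.where(np.arange(60) < 30, 1.0, -1.0)
X = rng.standard_normal((60, 20))
X[:, 0] += 2.0 * y
X[:, 1] -= 2.0 * y

selected, scores = select_features(X, y, k=2)
print("selected features:", sorted(selected.tolist()))
```

Setting `lam=0` reduces the target to the purely supervised label kernel, so `lam` controls how much unsupervised latent structure relaxes the supervised objective in this sketch. A support vector classifier with a precomputed kernel could then be trained on a kernel built from the selected features.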
