Paper title

J-PLUS DR3: Galaxy-Star-Quasar classification

Authors

von Marttens, R., Marra, V., Quartin, M., Casarini, L., Baqui, P. O., Alvarez-Candal, A., Galindo-Guil, F. J., Fernández-Ontiveros, J. A., del Pino, A., Díaz-García, L. A., López-Sanjuan, C., Alcaniz, J., Angulo, R., Cenarro, A. J., Cristóbal-Hornillos, D., Dupke, R., Ederoclite, A., Hernández-Monteagudo, C., Marín-Franch, A., Moles, M., Sodré, L., Varela, J., Vázquez Ramió, H.

Abstract

The Javalambre Photometric Local Universe Survey (J-PLUS) is a 12-band photometric survey using the 83-cm JAST telescope. Data Release 3 includes 47.4 million sources. J-PLUS DR3 provides only star-galaxy classification, so quasars are not distinguished from other sources. Given the size of the dataset, machine learning methods could provide a valid alternative classification and a solution to the classification of quasars. Our objective is to classify J-PLUS DR3 sources into galaxies, stars and quasars, outperforming the available classifiers in each class. We use an automated machine learning tool called TPOT to find an optimized pipeline to perform the classification. The supervised machine learning algorithms are trained on the crossmatch with SDSS DR18, LAMOST DR8 and Gaia. We checked that the training set of about 660 thousand galaxies, 1.2 million stars and 270 thousand quasars is both representative and contains a minimal presence of contaminants (less than 1%). We considered 37 features: the twelve photometric bands with their respective errors, six colors, four morphological parameters, galactic extinction with its error, and the PSF relative to the corresponding pointing. Using TPOT's genetic algorithm, we found that XGBoost provides the best performance: the AUC for galaxies, stars and quasars is above 0.99, and the average precision is above 0.99 for galaxies and stars and 0.96 for quasars. XGBoost outperforms the classifiers already provided in J-PLUS DR3 and also classifies quasars.
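The abstract reports a separate AUC and average precision for each of the three classes. As a minimal sketch of how such per-class (one-vs-rest) metrics can be computed from a classifier's predicted probabilities, the Python below uses only the standard library. It is not the authors' pipeline: the function names, the toy labels and the toy probabilities are all invented for illustration, and the average-precision formula assumes no tied scores among the positives.

```python
# Illustrative sketch only: one-vs-rest ROC AUC and average precision
# for a three-class problem. All names and data here are hypothetical.

CLASSES = ("galaxy", "star", "quasar")


def roc_auc(labels, scores):
    """ROC AUC as the probability that a positive outranks a negative.

    Ties count as half a win. Assumes at least one positive and one negative.
    """
    pos = [s for y, s in zip(labels, scores) if y]
    neg = [s for y, s in zip(labels, scores) if not y]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of the precision measured at the rank of each positive.

    Assumes at least one positive and no tied scores among positives.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, is_positive) in enumerate(ranked, start=1):
        if is_positive:
            hits += 1
            total += hits / rank
    return total / hits


def one_vs_rest_metrics(true_classes, probabilities):
    """Return {class_name: (auc, average_precision)} treating each class
    in turn as the positive class and the other two as negatives."""
    result = {}
    for k, name in enumerate(CLASSES):
        labels = [c == name for c in true_classes]
        scores = [p[k] for p in probabilities]
        result[name] = (roc_auc(labels, scores),
                        average_precision(labels, scores))
    return result


# Toy data (invented): true class and predicted (galaxy, star, quasar)
# probabilities for six sources.
true_classes = ["galaxy", "galaxy", "star", "star", "quasar", "quasar"]
probabilities = [
    (0.90, 0.05, 0.05),
    (0.60, 0.10, 0.30),
    (0.05, 0.90, 0.05),
    (0.10, 0.80, 0.10),
    (0.20, 0.10, 0.70),
    (0.50, 0.25, 0.25),  # a quasar the toy classifier ranks too low
]

metrics = one_vs_rest_metrics(true_classes, probabilities)
for name, (auc, ap) in metrics.items():
    print(f"{name}: AUC={auc:.3f}, AP={ap:.3f}")
```

In this toy case the galaxy and star scores separate perfectly, while the one misranked quasar lowers the quasar metrics. Average precision is more sensitive than AUC to errors in a rare class, which may be why the abstract quotes both.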