Paper title
Calculation of the High-Energy Neutron Flux for Anticipating Errors and Recovery Techniques in Exascale Supercomputer Centres
Paper authors
Paper abstract
The age of exascale computing has arrived, and the risks associated with neutrons and other atmospheric radiation become more critical as computing power increases: the expected Mean Time Between Failures will be reduced because of this radiation. In this work, a new and detailed calculation of the neutron flux for energies above 50 MeV is presented. It was carried out using state-of-the-art Monte Carlo astroparticle techniques and real atmospheric profiles at each of the next 23 exascale supercomputing facilities. The atmospheric impact on the flux and its seasonal variations were observed and characterised, and the barometric coefficient for high-energy neutrons was obtained at each site. With these coefficients, the potential risk of errors associated with an increase in the flux of energetic neutrons, such as single event upsets or transients, and the corresponding failure-in-time rates can be anticipated from the atmospheric pressure alone, before resources are assigned to critical tasks at each exascale facility. For clarity, examples of how cosmic rays affect the failure rate are included, so that administrators can better anticipate which more or less restrictive actions they could take to overcome errors.
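As a rough illustration of how a barometric coefficient could be used in this way, the sketch below applies the usual exponential pressure correction for cosmic-ray secondaries, N/N0 = exp(beta * (P - P0)), and then scales a failure-in-time (FIT) rate with the flux. The function names, the coefficient of -0.007 per hPa, the reference pressure, the FIT value and the device count are all illustrative assumptions, not values from the paper. The linear scaling of the error rate with the neutron flux is likewise a simplifying assumption.

```python
import math


def flux_ratio(pressure_hpa: float, ref_pressure_hpa: float, beta_per_hpa: float) -> float:
    """Relative neutron flux N/N0 at a given pressure.

    Uses the exponential barometric relation N/N0 = exp(beta * (P - P0)).
    beta is negative: higher pressure means more atmosphere overhead
    and therefore a lower flux at ground level.
    """
    return math.exp(beta_per_hpa * (pressure_hpa - ref_pressure_hpa))


def scaled_fit(ref_fit: float, pressure_hpa: float, ref_pressure_hpa: float,
               beta_per_hpa: float) -> float:
    """FIT rate at the current pressure.

    Assumption: the neutron-induced error rate is proportional to the flux.
    """
    return ref_fit * flux_ratio(pressure_hpa, ref_pressure_hpa, beta_per_hpa)


def mtbf_hours(fit_per_device: float, n_devices: int) -> float:
    """Mean time between failures of the whole system, in hours.

    One FIT is one failure per 1e9 device-hours, so the system failure
    rate is fit_per_device * n_devices / 1e9 failures per hour.
    """
    return 1e9 / (fit_per_device * n_devices)


if __name__ == "__main__":
    # All numbers below are hypothetical placeholders, not results from the paper.
    BETA = -0.007          # barometric coefficient, per hPa (assumed)
    P_REF = 1013.25        # reference pressure, hPa (assumed)
    FIT_REF = 100.0        # per-device FIT at the reference pressure (assumed)
    N_DEVICES = 10_000     # number of devices in the system (assumed)

    for pressure in (990.0, 1013.25, 1030.0):
        fit = scaled_fit(FIT_REF, pressure, P_REF, BETA)
        print(f"P = {pressure:7.2f} hPa  FIT = {fit:7.2f}  "
              f"MTBF = {mtbf_hours(fit, N_DEVICES):8.1f} h")
```

In a scheduler, a check of this kind could run just before a critical job is dispatched, comparing the pressure-adjusted mean time between failures against the expected job duration.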