Paper Title
Magnify Your Population: Statistical Downscaling to Augment the Spatial Resolution of Socioeconomic Census Data
Paper Authors
Paper Abstract
Fine-resolution estimates of demographic and socioeconomic attributes are crucial for planning and policy development. While several efforts have been made to produce fine-scale gridded population estimates, socioeconomic features are typically not available at scales finer than Census units, which may hide local heterogeneity and disparity. In this paper we present a new statistical downscaling approach to derive fine-scale estimates of key socioeconomic attributes. The method leverages demographic and geographical extensive covariates available at multiple scales, together with additional Census covariates available only at coarse resolution, which are included in the model hierarchically within a "forward learning" approach. For each selected socioeconomic variable, a Random Forest model is trained on the source Census units and then used to generate fine-scale gridded predictions, which are adjusted to ensure the best possible consistency with the coarser Census data. As a case study, we apply this method to Census data in the United States, downscaling the selected socioeconomic variables available at the block group level to a grid of ~300 spatial resolution. The accuracy of the method is assessed at both spatial scales: first by computing a pseudo cross-validation coefficient of determination for the predictions at the block group level and then, for extensive variables only, by computing the same score for the (unadjusted) predicted counts summed by block group. Based on these scores and on inspection of the downscaled maps, we conclude that our method is able to provide accurate, smoother, and more detailed socioeconomic estimates than the available Census data.
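The adjustment and evaluation steps mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how the gridded predictions are adjusted, so the proportional rescaling below is an assumption, not the authors' procedure. The function names (`rescale_to_totals`, `sum_by_unit`, `r_squared`) and the toy numbers are hypothetical. The Random Forest training step is omitted, and the predictions are taken as given.

```python
from collections import defaultdict


def sum_by_unit(values, unit_of_cell):
    """Sum fine-cell values within each coarse unit (e.g. block group)."""
    totals = defaultdict(float)
    for value, unit in zip(values, unit_of_cell):
        totals[unit] += value
    return dict(totals)


def rescale_to_totals(predictions, unit_of_cell, unit_totals):
    """Scale cell predictions so each coarse unit sums to its known count.

    ASSUMPTION: simple proportional rescaling; the paper's actual
    adjustment may differ.
    """
    predicted_sum = sum_by_unit(predictions, unit_of_cell)
    cells_in_unit = defaultdict(int)
    for unit in unit_of_cell:
        cells_in_unit[unit] += 1

    adjusted = []
    for value, unit in zip(predictions, unit_of_cell):
        if predicted_sum[unit] > 0:
            adjusted.append(value * unit_totals[unit] / predicted_sum[unit])
        else:
            # No predicted signal: spread the total evenly over the cells.
            adjusted.append(unit_totals[unit] / cells_in_unit[unit])
    return adjusted


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Toy data: four grid cells in two hypothetical block groups "A" and "B".
cell_predictions = [2.0, 6.0, 1.0, 3.0]
cell_units = ["A", "A", "B", "B"]
census_totals = {"A": 16.0, "B": 2.0}

adjusted_cells = rescale_to_totals(cell_predictions, cell_units, census_totals)

# Score the unadjusted predictions after summing them by block group.
units = sorted(census_totals)
unadjusted_sums = sum_by_unit(cell_predictions, cell_units)
score = r_squared([census_totals[u] for u in units],
                  [unadjusted_sums[u] for u in units])
```

After rescaling, the cells of each block group sum exactly to the Census count, while the relative pattern predicted within the unit is preserved. The score is computed on the unadjusted sums because the adjusted sums match the Census totals by construction and would give no information about accuracy.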