Paper title

Monte Carlo simulation studies on Python using the sstudy package with SQL databases as storage

Authors

Inácio, Marco H A

Abstract

Performance assessment is a key issue in the process of proposing new machine learning/statistical estimators. One way to carry out such a task is a simulation study, which can be defined as the procedure of estimating and comparing properties (such as predictive power) of estimators (and other statistics) by averaging over many replications given a true distribution; that is, generating a dataset, fitting the estimator, calculating and storing the predictive power, repeating the procedure many times, and finally averaging over the stored predictive powers. Given that, in this paper we present sstudy: a Python package designed to simplify the preparation of simulation studies using SQL database engines as the storage system. More specifically, we present its basic features, usage examples and references to its documentation. We also present a short statistical description of the simulation study procedure, with a simplified explanation of what it estimates, as well as some examples of applications.
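The procedure described in the abstract (generate, fit, score, store, repeat, average) can be sketched with the Python standard library alone. The following is a minimal illustration and does not use the sstudy API: the function name `run_simulation_study`, the table layout, the standard normal "true distribution", and the choice of the sample mean as the estimator are all assumptions made for this example.

```python
import random
import sqlite3
import statistics


def run_simulation_study(n_replications=200, n_train=50, n_test=50, seed=0):
    """Estimate the expected test MSE of the sample-mean predictor.

    Hypothetical helper for illustration only; not part of sstudy.
    """
    rng = random.Random(seed)
    conn = sqlite3.connect(":memory:")  # in-memory SQL database as storage
    conn.execute(
        "CREATE TABLE results (replication INTEGER PRIMARY KEY, mse REAL)"
    )
    for rep in range(n_replications):
        # 1. Generate a dataset from the known true distribution, N(0, 1).
        train = [rng.gauss(0.0, 1.0) for _ in range(n_train)]
        test = [rng.gauss(0.0, 1.0) for _ in range(n_test)]
        # 2. Fit the estimator (here, simply the sample mean).
        prediction = statistics.fmean(train)
        # 3. Compute the predictive power (mean squared error) and store it.
        mse = statistics.fmean([(y - prediction) ** 2 for y in test])
        conn.execute("INSERT INTO results VALUES (?, ?)", (rep, mse))
    conn.commit()
    # 4. Average over the stored results with a SQL aggregate.
    count, average = conn.execute(
        "SELECT COUNT(*), AVG(mse) FROM results"
    ).fetchone()
    conn.close()
    return count, average


count, average_mse = run_simulation_study()
print(f"{count} replications, average test MSE = {average_mse:.3f}")
```

Under these assumptions the quantity being estimated is the expected squared prediction error, which for the sample mean of 50 standard normal draws is 1 + 1/50 = 1.02, so the printed average should land close to that value. Storing one row per replication means an interrupted study could be resumed by checking which replications are already in the table, which is one motivation for using a database as the storage layer.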
