Paper Title
StaDRe and StaDRo: Reliability and Robustness Estimation of ML-based Forecasting using Statistical Distance Measures
Paper Authors
Paper Abstract
Reliability estimation of Machine Learning (ML) models is becoming a crucial subject. This is particularly the case when such models are deployed in safety-critical applications, where decisions based on model predictions can result in hazardous situations. In this regard, recent research has proposed methods to achieve safe, dependable, and reliable ML systems. One such method consists of detecting and analyzing distributional shift, and then measuring how such systems respond to these shifts. This approach was proposed in the earlier SafeML work. The present work focuses on applying SafeML to time series data, and on estimating the reliability and robustness of ML-based forecasting methods using statistical distance measures. To this end, the distance measures based on the Empirical Cumulative Distribution Function (ECDF) proposed in SafeML are explored to measure Statistical-Distance Dissimilarity (SDD) across time series. We then propose the SDD-based Reliability Estimate (StaDRe) and the SDD-based Robustness (StaDRo) measures. With the help of a clustering technique, the similarity between the statistical properties of the data seen during training and those of the forecasts is identified. The proposed method is capable of providing a link between dataset SDD and the Key Performance Indicators (KPIs) of the ML models.
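To make the notion of an ECDF-based distance concrete, the following is a minimal sketch, not the authors' implementation. It assumes two one-dimensional samples (for example, a training window and a forecast window of a time series) and computes two common ECDF-based distances, Kolmogorov-Smirnov and Wasserstein-1. The function names `ecdf` and `ecdf_distances` are hypothetical, and the choice of these two distances is an assumption made for illustration; how such distances are combined into SDD, StaDRe, or StaDRo is not specified in the abstract and is not shown here.

```python
import numpy as np


def ecdf(sample, grid):
    """Evaluate the empirical CDF of `sample` at every point of `grid`."""
    sample = np.sort(np.asarray(sample, dtype=float))
    # side="right" counts the sample values that are <= each grid point.
    return np.searchsorted(sample, grid, side="right") / sample.size


def ecdf_distances(a, b):
    """Return (Kolmogorov-Smirnov, Wasserstein-1) distances between two samples.

    Hypothetical helper for illustration only.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Both ECDFs are step functions that only change at observed values,
    # so evaluating them on the pooled, sorted sample is sufficient.
    grid = np.sort(np.concatenate([a, b]))
    gap = np.abs(ecdf(a, grid) - ecdf(b, grid))
    ks = gap.max()  # largest vertical gap between the ECDFs
    # Area between the ECDFs: the gap is constant between grid points.
    w1 = np.sum(gap[:-1] * np.diff(grid))
    return ks, w1


if __name__ == "__main__":
    train_window = np.array([0.0, 1.0, 2.0, 3.0])
    forecast_window = train_window + 1.0  # same shape, shifted by one unit
    ks, w1 = ecdf_distances(train_window, forecast_window)
    print(f"KS = {ks:.2f}, Wasserstein-1 = {w1:.2f}")  # KS = 0.25, Wasserstein-1 = 1.00
```

In this toy case the forecast window is the training window shifted by one unit, so the Wasserstein-1 distance equals the size of the shift, while the Kolmogorov-Smirnov distance reports the largest vertical gap between the two ECDFs.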