Paper Title
Benchmark for Uncertainty & Robustness in Self-Supervised Learning
Paper Authors
Paper Abstract
Self-Supervised Learning (SSL) is crucial for real-world applications, especially in data-hungry domains such as healthcare and self-driving cars. In addition to a lack of labeled data, these applications also suffer from distributional shifts. Therefore, an SSL method should provide robust generalization and uncertainty estimation on test data to be considered reliable in such high-stakes domains. However, existing approaches often focus on generalization without evaluating the model's uncertainty. The ability to compare SSL techniques on how well they improve these estimates is therefore critical for research on the reliability of self-supervised models. In this paper, we explore variants of SSL methods, including Jigsaw Puzzles, Context, Rotation, and Geometric Transformations Prediction for vision, as well as BERT and GPT for language tasks. We train the SSL tasks via auxiliary learning for vision models and via pre-training for language models, then evaluate generalization (in- and out-of-distribution classification accuracy) and uncertainty (expected calibration error) across datasets exhibiting covariate shift, including MNIST-C, CIFAR-10-C, CIFAR-10.1, and MNLI. Our goal is to create a benchmark, together with the outputs of our experiments, that provides a starting point for new SSL methods in Reliable Machine Learning. All source code to reproduce the results is available at https://github.com/hamanhbui/reliable_ssl_baselines.
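For concreteness, below is a minimal sketch of the kind of rotation-prediction auxiliary loss the abstract refers to (following the standard formulation of Gidaris et al., 2018). The `encoder` and `rot_head` modules and the function name are illustrative assumptions, not identifiers from the benchmark's actual code.

```python
import torch
import torch.nn.functional as F

def rotation_auxiliary_loss(encoder, rot_head, images):
    """Rotate each image by 0/90/180/270 degrees and predict which
    rotation was applied. `images` is a (N, C, H, W) float tensor;
    `encoder` maps images to features, `rot_head` maps features to
    4 rotation logits (both hypothetical modules)."""
    rotated, targets = [], []
    for k in range(4):                                  # k * 90 degrees
        rotated.append(torch.rot90(images, k, dims=(2, 3)))
        targets.append(torch.full((images.size(0),), k, dtype=torch.long))
    x = torch.cat(rotated)                              # (4N, C, H, W)
    y = torch.cat(targets).to(x.device)
    logits = rot_head(encoder(x))                       # (4N, 4)
    return F.cross_entropy(logits, y)
```

Under auxiliary learning, the total training objective would then be the supervised classification loss plus a weighted copy of this term, e.g. `loss = ce_loss + lam * rotation_auxiliary_loss(encoder, rot_head, images)`.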
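Similarly, here is a minimal sketch of the expected calibration error (ECE) metric used for the uncertainty evaluation, assuming the standard equal-width-binning formulation; the 15-bin default and function name are illustrative choices rather than details taken from the benchmark.

```python
import numpy as np

def expected_calibration_error(probs, labels, n_bins=15):
    """ECE: a population-weighted average, over confidence bins, of
    |accuracy - mean confidence|. `probs` is an (N, C) array of
    predicted probabilities; `labels` is an (N,) array of class ids."""
    confidences = probs.max(axis=1)           # top-1 confidence per sample
    predictions = probs.argmax(axis=1)        # top-1 predicted class
    accuracies = (predictions == labels)

    bin_edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bin_edges[:-1], bin_edges[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if in_bin.any():
            gap = abs(accuracies[in_bin].mean() - confidences[in_bin].mean())
            ece += in_bin.mean() * gap        # weight by fraction of samples in bin
    return ece
```

A well-calibrated model yields an ECE near zero: within each confidence bin, the model is correct about as often as its confidence claims.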