Paper Title
Neural Joint Entropy Estimation
Paper Authors
Paper Abstract
Estimating the entropy of a discrete random variable is a fundamental problem in information theory and related fields. This problem has many applications in various domains, including machine learning, statistics, and data compression. Over the years, a variety of estimation schemes have been suggested. However, despite significant progress, most methods still struggle when the sample size is small compared to the variable's alphabet size. In this work, we introduce a practical solution to this problem, which extends the work of McAllester and Stratos (2020). The proposed scheme exploits the generalization abilities of cross-entropy estimation with deep neural networks (DNNs) to achieve improved entropy estimation accuracy. Furthermore, we introduce a family of estimators for related information-theoretic measures, such as conditional entropy and mutual information. We show that these estimators are strongly consistent and demonstrate their performance in a variety of use cases. First, we consider large-alphabet entropy estimation. Then, we extend the scope to mutual information estimation. Next, we apply the proposed scheme to conditional mutual information estimation, focusing on independence testing tasks. Finally, we study a transfer entropy estimation problem. The proposed estimators demonstrate improved performance compared to existing methods in all tested setups.
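
The abstract describes the approach only at a high level, so the following is a minimal illustrative sketch, not the authors' implementation: the cross-entropy loss of a neural classifier upper-bounds a conditional entropy, so training a small DNN with a cross-entropy objective and reading off its loss yields a (conditional) entropy estimate, and mutual information can then be formed as a difference of entropies, I(X;Y) = H(Y) - H(Y|X). The architecture, optimizer, hyperparameters, and toy data below are assumptions made for illustration.

```python
# Minimal sketch (assumptions, not the paper's method): estimate H(Y|X) via the
# cross-entropy loss of a small classifier, H(Y) via a plug-in estimate, and
# combine them into a mutual information estimate I(X;Y) = H(Y) - H(Y|X).
import torch
import torch.nn as nn


def conditional_entropy_estimate(x, y, num_classes, epochs=200, lr=1e-2):
    """Estimate H(Y | X) in nats: the minimized cross-entropy of a classifier
    p_theta(y | x) upper-bounds the true conditional entropy."""
    model = nn.Sequential(
        nn.Linear(x.shape[1], 64),   # illustrative architecture choice
        nn.ReLU(),
        nn.Linear(64, num_classes),
    )
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()  # PyTorch cross-entropy is in nats
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        opt.step()
    with torch.no_grad():
        return loss_fn(model(x), y).item()


def entropy_estimate(y, num_classes):
    """Plug-in estimate of the marginal entropy H(Y) in nats."""
    probs = torch.bincount(y, minlength=num_classes).float()
    probs = probs / probs.sum()
    probs = probs[probs > 0]
    return -(probs * probs.log()).sum().item()


if __name__ == "__main__":
    # Toy usage: Y is a deterministic function of X, so H(Y|X) should train
    # toward 0 and I(X;Y) toward H(Y) ~ ln 2.
    torch.manual_seed(0)
    x = torch.randn(2000, 1)
    y = (x[:, 0] > 0).long()
    h_y = entropy_estimate(y, 2)
    h_y_given_x = conditional_entropy_estimate(x, y, 2)
    print(f"H(Y) ~ {h_y:.3f} nats, H(Y|X) ~ {h_y_given_x:.3f} nats, "
          f"I(X;Y) ~ {h_y - h_y_given_x:.3f} nats")
```

The design choice of estimating entropies through a learned cross-entropy objective, rather than through empirical frequencies alone, is what the abstract attributes the improved small-sample, large-alphabet behavior to; the sketch above only mirrors that structure in the simplest possible setting.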