Paper title
Frequency learning for image classification
Paper authors
Paper abstract
Machine learning applied to computer vision and signal processing is achieving results comparable to those of the human brain on specific tasks, owing to the major improvements brought by deep neural networks (DNNs). Most state-of-the-art architectures today are DNN-based, but only a few explore the frequency domain to extract useful information and improve results, as is common in the image processing field. In this context, this paper presents a new approach for exploring the Fourier transform of the input images, composed of trainable frequency filters that boost discriminative components in the spectrum. Additionally, we propose a slicing procedure that allows the network to learn both global and local features from the frequency-domain representations of the image blocks. In the selected experiments, the proposed method proved competitive with well-known DNN architectures, with the advantage of being a simpler and more lightweight model. This work also raises the discussion of how state-of-the-art DNN architectures can exploit not only spatial features but also frequency features to improve their performance on real-world problems.
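The two ideas named in the abstract, slicing the image into blocks and weighting each block's Fourier spectrum with trainable filters, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (`slice_into_blocks`, `filtered_spectrum`, `frequency_features`), the use of non-overlapping square blocks, the element-wise weighting of the magnitude spectrum, and the concatenation of features across block sizes are all assumptions made for illustration. The weights are plain arrays here; in a real model they would be parameters updated by gradient descent.

```python
import numpy as np


def slice_into_blocks(image, block_size):
    """Split a 2-D image (H, W) into non-overlapping square blocks.

    Assumption: blocks do not overlap and tile the image exactly.
    Returns an array of shape (num_blocks, block_size, block_size).
    """
    h, w = image.shape
    if h % block_size or w % block_size:
        raise ValueError("image sides must be multiples of block_size")
    rows, cols = h // block_size, w // block_size
    blocks = image.reshape(rows, block_size, cols, block_size)
    # Reorder to (row, col, y, x), then flatten the block grid.
    return blocks.transpose(0, 2, 1, 3).reshape(rows * cols, block_size, block_size)


def filtered_spectrum(blocks, weights):
    """Weight the magnitude spectrum of each block element-wise.

    `weights` stands in for a trainable frequency filter (hypothetical
    form: one real weight per frequency bin, shared by all blocks).
    """
    magnitude = np.abs(np.fft.fft2(blocks, axes=(-2, -1)))
    return magnitude * weights


def frequency_features(image, filters):
    """Concatenate filtered spectra computed at several block sizes.

    `filters` maps block_size -> weight array of that size. A block size
    equal to the image size gives global features; smaller sizes give
    local ones.
    """
    features = []
    for block_size, weights in filters.items():
        blocks = slice_into_blocks(image, block_size)
        features.append(filtered_spectrum(blocks, weights).ravel())
    return np.concatenate(features)
```

For an 8x8 image, calling `frequency_features(image, {8: np.ones((8, 8)), 4: np.ones((4, 4))})` combines one global 8x8 spectrum with four local 4x4 spectra into a single feature vector, which a classifier layer could then consume.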