Paper Title

Overcoming Overfitting and Large Weight Update Problem in Linear Rectifiers: Thresholded Exponential Rectified Linear Units

Paper Authors

Pandey, Vijay

Paper Abstract

In the past few years, linear rectified unit activation functions have shown their significance in neural networks, surpassing the performance of sigmoid activations. RELU (Nair & Hinton, 2010), ELU (Clevert et al., 2015), PRELU (He et al., 2015), LRELU (Maas et al., 2013), SRELU (Jin et al., 2016), ThresholdedRELU: each of these linear rectified activation functions has its own significance over the others in some aspect. Most of the time, these activation functions suffer from the bias shift problem due to a non-zero output mean, and from the large weight update problem in deep complex networks due to the unit gradient, which result in slower training and high variance in model prediction, respectively. In this paper, we propose the "Thresholded Exponential Rectified Linear Unit" (TERELU) activation function, which works better in alleviating overfitting and the large weight update problem. Along with alleviating the overfitting problem, this method also provides a good amount of non-linearity compared to other linear rectifiers. We show better performance on various datasets using neural networks with the TERELU activation compared to other activations.
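For reference, below is a minimal NumPy sketch of the standard rectifier activations cited in the abstract (ReLU, ELU, Thresholded ReLU), illustrating the unit gradient and non-zero output mean the abstract refers to. The exact TERELU formulation is defined in the paper and is not reproduced here; the function names and default parameters are illustrative only.

```python
# Minimal sketch of the rectifier activations referenced in the abstract.
# The exact TERELU definition is given in the paper and is not reproduced here.
import numpy as np

def relu(x):
    # Identity for x > 0 (unit gradient, which the abstract links to large
    # weight updates) and zero otherwise (non-zero output mean -> bias shift).
    return np.maximum(0.0, x)

def elu(x, alpha=1.0):
    # Exponential saturation for x <= 0 pushes the mean activation toward zero,
    # mitigating the bias shift problem.
    return np.where(x > 0, x, alpha * np.expm1(x))

def thresholded_relu(x, theta=1.0):
    # Passes the input through only above the threshold theta.
    return np.where(x > theta, x, 0.0)
```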
