The additional gain in performance obtained by adding dropout in the convolutional
layers (3.02% to 2.55%) is worth noting. One may have presumed that since the convolutional layers don’t have a lot of parameters, overfitting is not a problem and therefore
dropout would not have much effect. However, dropout in the lower layers still helps because it provides noisy inputs for the higher fully connected layers which prevents them
from overfitting.
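The mechanism can be sketched with a toy example: applying (inverted) dropout to convolutional feature maps means the downstream fully connected layer sees a different noisy version of its input at every training step. The shapes, dropout rate, and layer sizes below are illustrative, not taken from the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(x, p, training=True):
    """Inverted dropout: zero each unit with probability p and
    rescale survivors by 1/(1-p) so the expected activation is unchanged."""
    if not training or p == 0.0:
        return x
    mask = rng.random(x.shape) >= p
    return x * mask / (1.0 - p)

# Toy "convolutional" output for one example: 8 feature maps of 4x4.
conv_out = rng.standard_normal((8, 4, 4))

# Dropout on the conv output injects fresh noise into the input that
# the fully connected layer sees on every pass, which regularizes it.
noisy = dropout(conv_out, p=0.25)

# The fully connected layer consumes the flattened, noised maps.
W = rng.standard_normal((10, 8 * 4 * 4))
fc_out = W @ noisy.reshape(-1)
```

At test time the `training=False` path returns the activations unchanged, so no rescaling is needed at inference.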