Sequence Models course 4, Week 3, last assignment: Trigger_word_detection_v2a
We use this model, composed of the following four layers:
- A convolutional layer
- Two GRU layers
- A dense layer
There is one thing I do not understand:
- 1st layer: batchnorm, ReLU, dropout (batchnorm is done before dropout)
- 2nd layer: dropout is done before batchnorm
- 3rd layer: dropout, batchnorm, dropout
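For reference, here is a sketch of the model as I understand it from the notebook, written with the Keras functional API. The unit counts, dropout rates, and input shape are what I remember from the assignment, so please correct me if I misremember any of them; my question is about the ordering of the `BatchNormalization` and `Dropout` calls marked in the comments.

```python
# Sketch of the trigger-word model (layer sizes/rates as I recall them
# from the assignment notebook; they may not be exact).
from tensorflow.keras.layers import (Input, Conv1D, BatchNormalization,
                                     Activation, Dropout, GRU,
                                     TimeDistributed, Dense)
from tensorflow.keras.models import Model


def build_model(input_shape=(5511, 101)):
    X_input = Input(shape=input_shape)

    # Layer 1: conv -> batchnorm -> ReLU -> dropout
    # (batchnorm BEFORE dropout)
    X = Conv1D(filters=196, kernel_size=15, strides=4)(X_input)
    X = BatchNormalization()(X)
    X = Activation("relu")(X)
    X = Dropout(0.8)(X)

    # Layer 2: GRU -> dropout -> batchnorm
    # (dropout BEFORE batchnorm)
    X = GRU(units=128, return_sequences=True)(X)
    X = Dropout(0.8)(X)
    X = BatchNormalization()(X)

    # Layer 3: GRU -> dropout -> batchnorm -> dropout
    # (dropout on BOTH sides of batchnorm)
    X = GRU(units=128, return_sequences=True)(X)
    X = Dropout(0.8)(X)
    X = BatchNormalization()(X)
    X = Dropout(0.8)(X)

    # Layer 4: time-distributed dense with sigmoid, one prediction per step
    X = TimeDistributed(Dense(1, activation="sigmoid"))(X)

    return Model(inputs=X_input, outputs=X)
```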
When should dropout be applied before batchnorm, and when should batchnorm be applied before dropout? When does the order matter, and for what goal?
I know what batchnorm and dropout do, and why each is applied. So far I have been able to build an intuition for the models in the course, but here I cannot. Could someone provide an explanation (at least the intuition, please)? Thanks in advance.