In the Regularization programming assignment, while showing how dropout works, the code uses keep_prob=0.86. I didn't find an explanation of why it's exactly that value, and I was wondering: what would be a good way to choose it?
This is a specific example of “hyperparameter tuning”, which was the general topic covered in Week 1 of Course 2. It’s the usual story: you try different values until you find the one that works best. Since the purpose of regularization is to reduce overfitting, you would run the training with a particular keep_prob value and then compare the training accuracy and the validation accuracy to see how much overfitting remains. Rinse and repeat, until you get the best result that you can find.
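To make the role of keep_prob concrete, here is a minimal sketch of the inverted dropout forward step (this is a generic illustration, not the assignment's exact code): units are kept with probability keep_prob, and the surviving activations are scaled up so the expected value is unchanged.

```python
import numpy as np

def dropout_forward(A, keep_prob=0.86):
    """Inverted dropout on an activation matrix A.

    Each unit is kept with probability keep_prob; the survivors are
    divided by keep_prob so the expected activation stays the same.
    """
    D = np.random.rand(*A.shape) < keep_prob  # boolean keep-mask
    return (A * D) / keep_prob

# With keep_prob=0.86, roughly 14% of units are zeroed out each pass.
np.random.seed(0)
A = np.ones((3, 4))
A_drop = dropout_forward(A, keep_prob=0.86)
```

A higher keep_prob means less dropout (weaker regularization); keep_prob=1.0 disables it entirely.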
Of course the implication of the above is that there is no magic “silver bullet” value of keep_prob that works for all scenarios. Even with the same dataset, the best value may vary depending on your network architecture as well.
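The "rinse and repeat" loop can itself be sketched as a simple sweep over candidate values. Everything here is hypothetical scaffolding: `train_and_evaluate` stands in for whatever training routine you are using, and the candidate list is just an example grid.

```python
def tune_keep_prob(train_and_evaluate, candidates=(0.5, 0.7, 0.86, 0.95, 1.0)):
    """Try each keep_prob and keep the one with the best validation accuracy.

    train_and_evaluate(keep_prob) is a hypothetical callback that trains
    the model and returns (train_acc, val_acc).
    """
    best_kp, best_val = None, -1.0
    for kp in candidates:
        train_acc, val_acc = train_and_evaluate(kp)
        if val_acc > best_val:
            best_kp, best_val = kp, val_acc
    return best_kp, best_val

# Toy stand-in for a real training run, just to exercise the loop:
# it pretends validation accuracy peaks at keep_prob = 0.86.
def fake_run(kp):
    return 0.99, 1.0 - abs(kp - 0.86)

best_kp, best_val = tune_keep_prob(fake_run)
```

In practice you would also watch the gap between training and validation accuracy, not just the validation number, since a shrinking gap is the sign that the regularization is doing its job.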