Confusion about Training Set vs. Dev Set

In the first video, Training/Dev/Test sets, Prof Ng says (6:09) "the goal of the dev set or the development set is that you’re going to test different algorithms on it and see which algorithm works better."
So, does he mean that we try different hyperparameters, such as different numbers of layers, different numbers of units in a layer, the learning rate, etc., on the dev set, NOT the training set? If so, what role does the training set play? This is confusing to me.

Thanks in advance for your answer.

Hi @jasonchen,

This topic has been commented on before; take a look at the following:

Thank you for pointing me to the previous posts. So, in fact, we try different hyperparameters (e.g., the number of layers, the number of units per layer) on the training set, but use the dev set to evaluate the performance of the different architectures. Is my understanding correct?

Sounds correct to me.

There are more details about the train/dev/test sets that are covered in Course 3.
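To make the workflow concrete, here is a minimal sketch (using scikit-learn with synthetic data; the candidate settings are made up for illustration, not taken from the course). Each candidate configuration is fit on the training set, and the dev set is used only to compare the resulting models:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic data standing in for a real dataset.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

# Split: the training set is for fitting weights; the dev set is for
# comparing candidate configurations.
X_train, X_dev, y_train, y_dev = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Hypothetical candidate hyperparameter settings.
candidates = [
    {"hidden_layer_sizes": (16,), "learning_rate_init": 0.01},
    {"hidden_layer_sizes": (32, 16), "learning_rate_init": 0.01},
    {"hidden_layer_sizes": (32, 16), "learning_rate_init": 0.001},
]

best_acc, best_params = -1.0, None
for params in candidates:
    model = MLPClassifier(max_iter=500, random_state=0, **params)
    model.fit(X_train, y_train)          # training (backprop) on the train set only
    dev_acc = model.score(X_dev, y_dev)  # evaluation (forward pass) on the dev set
    if dev_acc > best_acc:
        best_acc, best_params = dev_acc, params

print(f"selected by dev accuracy ({best_acc:.3f}): {best_params}")
```

The dev set never influences the weights directly; it only decides which trained model we keep.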

Though I’ve read the previous discussions, can I confirm one thing? When we evaluate the performance of trained models on the dev set, do we run forward propagation only (with no backpropagation) to produce predictions?
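For example (a minimal PyTorch sketch with a made-up model and synthetic data, purely to pin down what I mean), is dev-set evaluation essentially this forward-only pass?

```python
import torch

# A stand-in for a model already trained on the training set.
model = torch.nn.Sequential(
    torch.nn.Linear(20, 16),
    torch.nn.ReLU(),
    torch.nn.Linear(16, 1),
)

# Synthetic dev data, just for shape.
X_dev = torch.randn(100, 20)
y_dev = (torch.rand(100) > 0.5).float()

model.eval()              # evaluation mode (matters for dropout/batch norm)
with torch.no_grad():     # forward propagation only: no gradients, no backprop
    logits = model(X_dev).squeeze(1)
    preds = (torch.sigmoid(logits) > 0.5).float()
    dev_acc = (preds == y_dev).float().mean().item()

print(f"dev accuracy: {dev_acc:.3f}")
```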
Thank you!

I am also confused about this. I thought we do not do backpropagation on the dev set, so how can we somehow overfit to it? I think Andrew mentioned in the lecture that if there is a big gap between the test error and the dev error, we might be overfitting to the dev set and might want to find a larger dev set. Can anyone help clarify a bit? Thanks
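One way to see it: even without backpropagation on the dev data, *choosing* among many candidate models by their dev score is itself a form of fitting to the dev set. A toy simulation (random "classifiers" scored against coin-flip labels; nothing here comes from the course) illustrates the selection effect:

```python
import numpy as np

rng = np.random.default_rng(0)
n_dev, n_test, n_trials = 50, 10_000, 500

# Labels are pure coin flips, so every "model" below has true accuracy 0.5.
y_dev = rng.integers(0, 2, size=n_dev)
y_test = rng.integers(0, 2, size=n_test)

# Each seed defines one fixed random classifier; we "tune" by keeping the
# seed with the best dev accuracy -- no backprop happens anywhere.
best_dev_acc, best_seed = -1.0, None
for seed in range(n_trials):
    preds = np.random.default_rng(seed).integers(0, 2, size=n_dev)
    acc = (preds == y_dev).mean()
    if acc > best_dev_acc:
        best_dev_acc, best_seed = acc, seed

# The selected classifier evaluated on fresh test data.
test_preds = np.random.default_rng(best_seed).integers(0, 2, size=n_test)
test_acc = (test_preds == y_test).mean()

print(f"best dev accuracy over {n_trials} candidates: {best_dev_acc:.2f}")  # well above 0.5
print(f"same candidate on the test set: {test_acc:.2f}")                    # back near 0.5
```

The winning candidate looks well above chance on the small dev set purely by luck and falls back to about 50% on the test set. That dev-test gap is the symptom Andrew describes, and a larger dev set shrinks it.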