How to use the train-dev set?

Hi, we used to have three data sets (train, dev and test), and these sets are used as follows:

  1. With the train set, we optimize the model’s weights and biases through forward and backward propagation to learn the best predictions.

  2. The dev set allows you to estimate the model’s generalization performance and take actions such as early stopping or selecting the best model based on its performance on this data.

  3. After training the model, you use the test set to assess the model’s final performance. It provides an evaluation of how well the model generalizes to unseen data in a real-world scenario. The test set is crucial for validating the model’s performance before deploying it or reporting results.

The train-dev set is introduced for analyzing bias and variance when the data distributions of the train set and the dev/test sets are mismatched. But I am not sure how exactly to use the train-dev set.

No, that’s not correct.

The training set is used for training, to learn the weight values that minimize the cost. That is item 1 in your original post, except that we don’t “learn the best predictions”. We learn the weights that minimize the cost.

As mentioned in the lecture, the train-dev set is not used for training. It is drawn from the same distribution as the training set but held out from training, and it is used to figure out why the difference between training set error and dev set error is high:

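  1. Does the model fail on examples it did not see during training, even when they come from the training distribution? Then train-dev error is also much higher than training error. (variance problem)
  2. Do train and dev sets come from different distributions? Then train-dev error stays close to training error, but dev error is much higher. (data mismatch problem)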
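
To make the two questions above concrete, here is a minimal sketch of the comparison. The function name `diagnose_gaps` and the error values are made up for illustration; they are not from the lecture.

```python
def diagnose_gaps(train_err, train_dev_err, dev_err):
    """Split the train-to-dev error gap into two parts (errors as fractions)."""
    return {
        # Same distribution as training, but unseen data: a large gap points to variance.
        "variance": train_dev_err - train_err,
        # Different distribution, both unseen: a large gap points to data mismatch.
        "data_mismatch": dev_err - train_dev_err,
    }

# Hypothetical numbers: train-dev error is already far above training error.
case_a = diagnose_gaps(train_err=0.01, train_dev_err=0.09, dev_err=0.10)
# Hypothetical numbers: train-dev error is close to training error, dev error is not.
case_b = diagnose_gaps(train_err=0.01, train_dev_err=0.015, dev_err=0.10)

for name, gaps in (("case_a", case_a), ("case_b", case_b)):
    print(name, "largest gap:", max(gaps, key=gaps.get))
```

In the first case the larger gap is the variance one, in the second it is the data mismatch one. How large a gap counts as "high" is a judgment call for your problem.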

Once you train the NN on the train set, measure its error on both the train-dev and dev sets. Please revisit the lecture with the above pointers in mind.
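
If it helps, the workflow could look roughly like this. It is only a sketch: `carve_train_dev`, `error_rate` and the 10% fraction are assumptions of mine, and `predict` stands in for whatever trained model you have.

```python
import random

def carve_train_dev(examples, train_dev_fraction=0.1, seed=0):
    """Hold out a random slice of the training data as the train-dev set."""
    rng = random.Random(seed)
    shuffled = list(examples)
    rng.shuffle(shuffled)
    n_hold = max(1, int(len(shuffled) * train_dev_fraction))
    # The held-out slice shares the training distribution but is never trained on.
    return shuffled[n_hold:], shuffled[:n_hold]  # (train, train_dev)

def error_rate(predict, examples):
    """Fraction of (x, y) pairs for which predict(x) != y."""
    return sum(predict(x) != y for x, y in examples) / len(examples)

# Toy stand-in data: x is a number, the label is whether x is positive.
all_train = [(x, x > 0) for x in range(-50, 50)]
train, train_dev = carve_train_dev(all_train)
print(len(train), len(train_dev))
```

You would fit the model on `train` only, then call `error_rate` on `train`, `train_dev` and your dev set and compare the three numbers.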

My original post had a typo: I meant the train-dev set, not the train set, as I wrote correctly in the title. Sorry for the confusion.