After working further on the problem, you’ve decided to correct the incorrectly labeled data on the dev set. Which of these statements do you agree with? (Check all that apply).

  1. How does fixing incorrectly labeled data in the training set change its distribution?
  2. Why are DL algorithms robust to slightly different train and dev distributions?

