Week 2 Assignment: Why are the results better on the dev set?

In the “Transfer Learning with MobileNet” assignment, the results were better on the validation set than on the training set. I would expect the training and validation sets to have the same distribution, since they were randomly drawn from the same dataset. Or is the dataset so small that the result is partly down to chance? I’m curious about the reason. Thank you!

Another try.

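When data augmentation is used, training accuracy is likely to drop below validation accuracy: the augmentation is applied only to the training images, so the model is scored on harder inputs during training than during validation. Please train the model longer.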
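To make the effect concrete, here is a toy sketch in plain Python. It is not the assignment's code: the threshold "model", the noise-based `augment` function, and the sample counts are all made up for illustration. It only shows that the same fixed model can score lower on augmented training inputs than on clean validation inputs, even though both splits come from the same distribution.

```python
import random

def make_split(n, rng):
    """Return n (x, label) pairs; class 0 sits near -1, class 1 near +1."""
    data = []
    for i in range(n):
        label = i % 2
        center = 1.0 if label == 1 else -1.0
        data.append((center + rng.uniform(-0.5, 0.5), label))
    return data

def augment(x, rng):
    """Hypothetical augmentation: perturb the input with Gaussian noise."""
    return x + rng.gauss(0.0, 2.0)

def predict(x):
    """Fixed toy model: a threshold at zero."""
    return 1 if x > 0 else 0

def accuracy(data, rng=None):
    """Score the model; if rng is given, augment each input first."""
    correct = 0
    for x, label in data:
        if rng is not None:
            x = augment(x, rng)
        correct += predict(x) == label
    return correct / len(data)

rng = random.Random(0)
train = make_split(1000, rng)  # same distribution as the validation split
val = make_split(1000, rng)

train_acc = accuracy(train, rng)  # measured on augmented inputs
val_acc = accuracy(val)           # measured on clean inputs

print(f"training accuracy (augmented inputs): {train_acc:.3f}")
print(f"validation accuracy (clean inputs):   {val_acc:.3f}")
```

In this toy setup the model is never updated, so the gap comes entirely from how the inputs are perturbed at scoring time, not from any difference between the two splits.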