I am a bit confused by this tutorial's use of this utility.
Since we already have a validation set, why does the tutorial split the training set? And why do we split the validation set again?
When calling model.fit, we say which is the training set and which is the validation set, so there's no reason to split them, right?
I am using similar code for my project, but I do not set validation_split in tf.keras.utils.image_dataset_from_directory(), since my dataset is already split into train, validation and test.
Hey @Marios_Constantinou,
In the tutorial's dataset, we have a total of 3670 images, which haven't been split into training and val sets yet. We create a training:val split of 80:20, which divides these images into 2936 for training and 734 for validation, as can be seen in the above image.
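To make the numbers concrete, here is a small sketch of the arithmetic behind the 80:20 split. It only reproduces the counts quoted above; the variable names are made up for illustration, and this is not meant to mirror Keras's internal code.

```python
# Total number of images in the tutorial's (not yet split) dataset.
total_images = 3670

# 80:20 split, so 20 percent of the images go to validation.
val_percent = 20

# Integer arithmetic avoids any floating-point rounding surprises.
val_count = total_images * val_percent // 100
train_count = total_images - val_count

print(train_count, val_count)  # 2936 734
```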
Once we have divided the dataset into training and val, we specify them in the model.fit() method, since this is how the developers designed this TensorFlow method. I hope this helps.
Regards,
Elemento
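To illustrate the reply above, here is a minimal sketch of the single-directory case, assuming a folder with one subfolder per class. The helper name load_train_val and its default values are hypothetical. Both calls read the same directory with the same seed and validation_split, and the subset argument only selects which part of that one split is returned, so the data is split once, not twice.

```python
import tensorflow as tf


def load_train_val(data_dir, image_size=(180, 180), batch_size=32, seed=123):
    """Carve one un-split image directory into training and validation sets."""
    # Arguments shared by both calls. Using the same seed and the same
    # validation_split in both calls keeps the two subsets from overlapping.
    common = dict(
        validation_split=0.2,
        seed=seed,
        image_size=image_size,
        batch_size=batch_size,
    )
    # subset="training" returns the 80 percent part of the split.
    train_ds = tf.keras.utils.image_dataset_from_directory(
        data_dir, subset="training", **common
    )
    # subset="validation" returns the remaining 20 percent.
    val_ds = tf.keras.utils.image_dataset_from_directory(
        data_dir, subset="validation", **common
    )
    return train_ds, val_ds
```

The two datasets can then be passed to model.fit as the training data and as validation_data respectively.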
Oh yeah, since both subsets are split from the same directory. In my case though, where I have separate directories for train, val and test, I don't need to use "validation_split", since my data is already split into its own directories. Thanks!
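For the pre-split case, a rough sketch could look like the following. The subfolder names "train", "val" and "test" and the helper load_presplit are assumptions for illustration; adjust them to the actual layout. No validation_split or subset is passed, because each directory is already its own split.

```python
import os

import tensorflow as tf


def load_presplit(root_dir, image_size=(180, 180), batch_size=32):
    """Load datasets from root_dir/train, root_dir/val and root_dir/test."""
    datasets = {}
    for split in ("train", "val", "test"):
        datasets[split] = tf.keras.utils.image_dataset_from_directory(
            os.path.join(root_dir, split),
            image_size=image_size,
            batch_size=batch_size,
            # Shuffle only the training data; keep val/test in a fixed order.
            shuffle=(split == "train"),
        )
    return datasets
```

Here datasets["train"] and datasets["val"] would go to model.fit, and datasets["test"] to model.evaluate.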