Local installation of tf_utils

My labels are a sequence of numbers from 0 to 9, corresponding to the categories. I have already converted them to one-hot. So I will have an X_training of images that I have already read in (I am in the process of giving them the right shape) and the corresponding Y_training. Same for testing.
From what I have seen, reading the input via for loops is not a bottleneck, provided the datasets are not huge. For us beginners, that is friendlier than the more esoteric formats.
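For reference, the one-hot conversion described above can be sketched in numpy like this (the array names are just illustrative, not anything from the course code):

```python
import numpy as np

def to_one_hot(labels, num_classes=10):
    """Convert integer class ids (0..num_classes-1) to one-hot rows."""
    one_hot = np.zeros((labels.size, num_classes))
    # Fancy indexing: set one entry per row at the label's column
    one_hot[np.arange(labels.size), labels] = 1.0
    return one_hot

# e.g. three labels -> a (3, 10) matrix with a single 1 per row
Y_training = to_one_hot(np.array([3, 0, 9]))
```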

Sure, using for loops to read through the files in a directory is fine if your dataset is not that large. Once you add all the preprocessing to resize and reshape the images, you will probably want to make all of that a “preprocessing” step that you run once, storing the resulting numpy arrays with “numpy save”. That way you only do the work once, not every time you run the training. This becomes more critical if you later increase the size of your dataset.
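A minimal sketch of that cache-once pattern, with placeholder arrays standing in for the real preprocessed data and a made-up filename:

```python
import numpy as np

# Placeholders for the output of the one-time preprocessing step
# (in practice: images read, resized, and reshaped in a for loop).
X_training = np.zeros((100, 64, 64, 3))
Y_training = np.zeros((100, 10))

# Store both arrays together in a single .npz archive
np.savez("train_data.npz", X=X_training, Y=Y_training)

# On later training runs, skip the preprocessing and just reload:
data = np.load("train_data.npz")
X, Y = data["X"], data["Y"]
```

(`np.savez_compressed` is a drop-in alternative if disk space matters more than load speed.)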

I actually use, for a different purpose (“deblurring” of image stacks) a high-level image processing suite, “Huygens”, developed by SVI Amsterdam, that converts large sets of tiff images to H5. The ability to input tiffs is a big advantage, as the encoding is simple and understandable. As most in this group probably know, tiff is also the common output format of most research imaging systems, including electron microscopes. I highly recommend Huygens. It is not free, however.

It sounds like you are in good shape then. Maybe Huygens can also do whatever resizing or rescaling you need as part of the deblurring preprocessing. Then, with the H5 output of that step, you’re ready to feed it as input to the kind of logic we have here. You can take a look at the load_dataset code in Week 2 LR or the Week 4 Assignment to see how to unpack an H5 file.
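As a rough sketch of what that unpacking looks like with h5py (the file and dataset key names here are made up for illustration; the course files use their own keys, so check load_dataset for the real ones):

```python
import h5py
import numpy as np

# Write a tiny example file first, as a stand-in for real H5 output.
with h5py.File("example.h5", "w") as f:
    f.create_dataset("images", data=np.zeros((4, 64, 64, 3)))
    f.create_dataset("labels", data=np.arange(4))

# Unpacking: open the file, index each dataset by key, wrap in np.array
with h5py.File("example.h5", "r") as f:
    X_training = np.array(f["images"])
    Y_training = np.array(f["labels"])
```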