How to deal with an image dataset?

So far I have only worked with .csv dataset files. When you have a dataset of images, how do you work with it? Do you convert the images to CSV files, or are there other ways?


Image files are generally loaded as 2D matrices where each element is the value of one pixel. If it is a color image, you’ll usually have a set of 2D matrices stacked into a 3D array, where the third dimension indexes the three color planes (red, green, blue).

So the format would be (h, w, c), for height, width, and color channels.
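To make the (h, w, c) layout concrete, here is a minimal sketch using NumPy. The 4x6 image size and the variable names are made up for illustration, and the flatten-and-scale step at the end is just one common way to turn an image into a feature vector.

```python
import numpy as np

# Hypothetical 4x6 RGB image: shape is (height, width, channels).
height, width, channels = 4, 6, 3
image = np.zeros((height, width, channels), dtype=np.uint8)

# Paint the top-left pixel pure red: in RGB order, channel 0 is red.
image[0, 0] = [255, 0, 0]

# Each channel on its own is an ordinary 2D matrix.
red_plane = image[:, :, 0]

# Flatten into one long vector and scale pixel values to [0, 1].
features = image.astype(np.float32).reshape(-1) / 255.0

print(image.shape)      # (4, 6, 3)
print(red_plane.shape)  # (4, 6)
print(features.shape)   # (72,)
```

So an image is already a table of numbers once it is loaded, which is why there is no separate CSV step.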

I believe you’ll see this in the CNN course.


Right! We’ve been dealing with image files since DLS C1 W2. There is no need to convert the files to CSV: there are a number of image file formats (JPEG, TIFF, PNG, …) and a number of container formats like h5, tar and zip for packaging multiple image files into a single file. In the courses here, you will also see datasets presented as a directory full of .jpg files. The assignments then give you the logic to traverse the file tree and load the individual images.
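As a rough idea of what "traverse the file tree" can look like, here is a sketch using only the Python standard library. The function name list_labeled_images, the extension list, and the convention that each subfolder name is the class label are all assumptions for illustration, not what any particular assignment does. The demo files are empty placeholders rather than real images.

```python
import tempfile
from pathlib import Path

# Assumed set of extensions to treat as images.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}

def list_labeled_images(root):
    """Return sorted (path, label) pairs, using each file's parent folder name as its label."""
    pairs = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix.lower() in IMAGE_EXTENSIONS:
            pairs.append((path, path.parent.name))
    return pairs

# Demo on a throwaway tree of empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    for label, name in [("cats", "a.jpg"), ("cats", "b.png"),
                        ("dogs", "c.JPG"), ("dogs", "notes.txt")]:
        folder = Path(tmp) / label
        folder.mkdir(exist_ok=True)
        (folder / name).touch()
    found = [(p.name, label) for p, label in list_labeled_images(tmp)]

print(found)
```

Each path would then be passed to an image-reading library to get the actual pixel array.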

You can find plenty of examples in the various course assignments. E.g. in DLS C1 W2 Logistic Regression, look at the cell that calls the load_dataset function. Then find the source for that function by clicking “File → Open” and opening the file lr_utils.py. You’ll find that it uses the H5 file format and a Python library for dealing with H5 files.
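If you want to see the general pattern without opening the notebook, here is a small sketch using the h5py package. It is not the course's load_dataset: the file name, the dataset names train_set_x and train_set_y, and the random pixel data are all placeholders chosen for this example.

```python
import os
import tempfile

import h5py
import numpy as np

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "toy_images.h5")  # hypothetical file name

    # Write 5 random 8x8 RGB images and one label per image.
    rng = np.random.default_rng(0)
    pixels = rng.integers(0, 256, size=(5, 8, 8, 3), dtype=np.uint8)
    with h5py.File(path, "w") as f:
        f.create_dataset("train_set_x", data=pixels)
        f.create_dataset("train_set_y", data=np.array([0, 1, 1, 0, 1]))

    # Read everything back into NumPy arrays; [:] copies the data into memory.
    with h5py.File(path, "r") as f:
        train_x = f["train_set_x"][:]
        train_y = f["train_set_y"][:]

print(train_x.shape, train_x.dtype)  # (5, 8, 8, 3) uint8
print(train_y)
```

The whole dataset lives in one file, and the images come back already stacked as (number of images, h, w, c).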

In DLS C4 W3 in the U-Net assignment, you’ll see an example where the images are in directories and are in PNG format. There they use the imageio library to access the image files.

In DLS C4 W2 in the Transfer Learning with MobileNet assignment, you’ll see an example of directories with JPG files and they use TF functions to access the files.

A number of the other assignments in DLS C4 deal with images, so watch what happens with the input data.

You can also go look at many of the Kaggle projects, e.g. the famous “Cats and Dogs” project. They give you the datasets and explain how to access them.

Or take a look at MNIST for an example of smaller grayscale images.