How can I turn my raw images and true labels into an (X, y) array for deep learning?

The DLS Specialization does not cover preparing data for use with Machine Learning models. That is a big topic, generally called “Data Science”. You can find lots of courses on that whole area, including the Practical Data Science Specialization from DeepLearning.AI.

But your question is very specific and it does have a straightforward answer: if you are building a Deep Learning model that processes images, you need to decide on a fixed resolution and image type (RGB, CMYK, greyscale …) for the input images and on a particular “label” format. You will then need to convert every input into the chosen image representation and size. Fortunately, any decent image library provides “resize” and type conversion functions. For example, take a look at the Logistic Regression assignment in Week 2 and see the section at the end titled “Test with Your Own Image”. In that section, they give you the code to resize your images to 64 x 64. It’s a single line of code using the Image module from the Python imaging library (PIL).
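To make the conversion step concrete, here is a minimal sketch using Pillow and NumPy. It is not the assignment's code: the helper names image_to_array and build_dataset are hypothetical, and the 64 x 64 RGB target and the scaling to [0, 1] are just one reasonable set of choices.

```python
import numpy as np
from PIL import Image


def image_to_array(img, size=(64, 64), mode="RGB"):
    """Convert one PIL image to a fixed-size float32 array scaled to [0, 1].

    Hypothetical helper: `size` is (width, height), as Pillow expects.
    """
    img = img.convert(mode).resize(size)  # unify image type, then resolution
    return np.asarray(img, dtype=np.float32) / 255.0


def build_dataset(images, labels, size=(64, 64)):
    """Stack converted images into X and the labels into y."""
    X = np.stack([image_to_array(im, size) for im in images])
    y = np.asarray(labels, dtype=np.int64)
    return X, y


# Synthetic stand-ins for raw images of differing size and type.
# With real files you would use Image.open(path) instead.
raw_images = [
    Image.new("L", (100, 80), color=128),          # greyscale, 100 x 80
    Image.new("RGB", (32, 48), color=(255, 0, 0)),  # colour, 32 x 48
]
X, y = build_dataset(raw_images, labels=[0, 1])
print(X.shape, y.shape)
```

X has shape (number of images, height, width, channels) and y has one label per image. If your model expects flattened inputs, as the Week 2 assignment does, you would reshape X afterwards.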

As you go through the Week 2 assignment, you can also look at how they read in the input images. See the routine load_dataset, which is in one of the accompanying utility files (see the FAQ Thread for instructions about how to open such a file). You’ll see that the images and labels are packaged as an “h5” format database file. Here’s a thread from a fellow student who did the work to figure out how to create an h5 file with images and labels.
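For a rough idea of what packaging images and labels into an h5 file can look like, here is a small sketch using the h5py library. It is not the fellow student's solution: the function names and the dataset keys "set_x" and "set_y" are placeholders chosen for illustration, so check load_dataset in your copy of the utility file for the keys it actually reads.

```python
import os
import tempfile

import h5py
import numpy as np


def save_h5(path, X, y):
    """Write images X and labels y into one HDF5 file (keys are placeholders)."""
    with h5py.File(path, "w") as f:
        f.create_dataset("set_x", data=X)
        f.create_dataset("set_y", data=y)


def load_h5(path):
    """Read the arrays back into memory as NumPy arrays."""
    with h5py.File(path, "r") as f:
        return f["set_x"][:], f["set_y"][:]


# Fake data standing in for ten 64 x 64 RGB images with binary labels.
rng = np.random.default_rng(0)
X_in = rng.integers(0, 256, size=(10, 64, 64, 3), dtype=np.uint8)
y_in = rng.integers(0, 2, size=(10,), dtype=np.int64)

path = os.path.join(tempfile.mkdtemp(), "my_dataset.h5")
save_h5(path, X_in, y_in)
X_out, y_out = load_h5(path)
print(X_out.shape, y_out.shape)
```

Storing the pixels as uint8 keeps the file small; you can scale to floating point after loading.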