Week 2 : Logistic Regression with a Neural Network mindset

Anna_Lymarenko · March 22, 2022, 2:09pm

Hi all!

I successfully made it through the exercise; however, some points are still unclear to me.

Just after loading the data, there is this comment:
“We added “_orig” at the end of image datasets (train and test) because we are going to preprocess them (…) the labels train_set_y and test_set_y don’t need any preprocessing”.

Could someone please explain why exactly y labels don’t need preprocessing?

kenb · March 22, 2022, 4:00pm

Hi @Anna_Lymarenko, and welcome to the course!

The input data consists of pixel intensities (from 0 to 255) for the three primary color channels: red, green, and blue. Exercise 1 explains that train_set_x_orig is a numpy array of shape (m_train, num_px, num_px, 3). That is, there are m_train training examples (i.e. images), each of height and width num_px, with three color channels per pixel. An analogous structure holds for the test set images (in test_set_x_orig). This is how the data are loaded. The “orig” suffix is attached to indicate that the data will be transformed into a form convenient for building a model (i.e. “preprocessed”).

Exercise 2 has you take care of the “preprocessing” of the downloaded data. Specifically, you “flattened” the downloaded numpy arrays, turning each image into a vector of shape (num_px * num_px * 3, 1). The training set has m_train examples, so the shape of train_set_x_flatten is (num_px * num_px * 3, m_train). Finally, the pixel intensity values are “normalized” by dividing each by 255. test_set_x_orig is transformed analogously (but there are fewer images).
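To make the two preprocessing steps concrete, here is a minimal sketch. It is not the notebook's own code: the array below is random stand-in data, and the sizes (5 images of 4 x 4 pixels) are made up for illustration. The reshape-then-transpose pattern is one common way to get one column per image.

```python
import numpy as np

# Hypothetical stand-in for the loaded images: 5 random 4x4 RGB "images".
# (The real dataset is loaded by the notebook; these sizes are invented.)
m_train, num_px = 5, 4
rng = np.random.default_rng(0)
train_set_x_orig = rng.integers(
    0, 256, size=(m_train, num_px, num_px, 3), dtype=np.uint8
)

# Step 1: flatten each image into a column vector.
# (m_train, num_px, num_px, 3) -> (m_train, num_px*num_px*3) -> transpose
train_set_x_flatten = train_set_x_orig.reshape(m_train, -1).T

# Step 2: scale pixel intensities from [0, 255] to [0, 1].
train_set_x = train_set_x_flatten / 255.0

print(train_set_x_orig.shape)     # (5, 4, 4, 3)
print(train_set_x_flatten.shape)  # (48, 5)
```

Each column of train_set_x is one image, which is the layout the rest of the exercise works with.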

But train_set_y and test_set_y do not need preprocessing, since these are merely labels identifying whether the image in question is one of a cat (y = 1) or not a cat (y = 0). They are the outputs that your model attempts to predict by assigning a probability to each case. You can think of y = 1 as a cat image with probability 1, and y = 0 as a cat image with probability 0. These are the target outputs of the model. Thus, no “orig” suffix is needed for the downloaded y-data.
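As a rough illustration of why the labels can be used as loaded, the sketch below compares made-up labels against made-up model outputs. The values in train_set_y and A are invented, and the (1, m) row shape is an assumption about how the labels are stored. The point is only that 0/1 labels already live on the same scale as the predicted probabilities, so they plug straight into a cross-entropy cost and an accuracy check.

```python
import numpy as np

# Hypothetical labels for 5 images: 1 = cat, 0 = not a cat.
# Assumed to be stored as a single row, one entry per image.
train_set_y = np.array([[0, 1, 1, 0, 1]])

# Hypothetical model outputs: predicted probability that each image is a cat.
A = np.array([[0.1, 0.8, 0.6, 0.3, 0.9]])

# Cross-entropy cost uses the 0/1 labels directly, with no rescaling.
cost = -np.mean(train_set_y * np.log(A) + (1 - train_set_y) * np.log(1 - A))

# Thresholding the probabilities at 0.5 gives predictions comparable to y.
predictions = (A > 0.5).astype(int)
accuracy = np.mean(predictions == train_set_y)

print(cost)
print(accuracy)
```

Contrast this with the images, whose raw values range from 0 to 255 and whose 4-dimensional shape does not fit the matrix operations used by the model.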
