W2_A2_Ex-2_Reshaping the matrix

Sahngyoon_Rhee · January 3, 2024, 5:20pm

On Week 2: Logistic_Regression_with_a_Neural_Network_mindset Exercise 2, we are told to use the formula

X_flatten = X.reshape(X.shape[0],-1).T

on train_set_x_orig. However, when I saw this formula, I decided to use the simpler version

X_flatten = X.reshape(-1, X.shape[0]).

It seems to me that both of these would give the same result. However, I got an error when I applied the second one. Why is this the case?

pastorsoto · January 3, 2024, 5:49pm

Hi @Sahngyoon_Rhee, great question!

The difference between the two reshape operations you mentioned lies in how they arrange the elements of the array in the reshaped matrix.

Let’s break down the two operations:

X_flatten = X.reshape(X.shape[0], -1).T:
- X.reshape(X.shape[0], -1): This reshapes X into a 2D array where the first dimension is X.shape[0] (the number of training examples) and the second dimension is automatically inferred (-1).
- .T: This transposes the resulting 2D array. Transposing swaps the rows and columns.
X_flatten = X.reshape(-1, X.shape[0]):
- This reshapes X into a 2D array where the first dimension is automatically inferred and the second dimension is X.shape[0]. There’s no transpose operation here.
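To make the breakdown above concrete, here is a minimal sketch on a toy array. The array X and the sizes m, h, w, c are made up for illustration (a stand-in for train_set_x_orig, not the real dataset). It only looks at the shapes produced by each step:

```python
import numpy as np

# Hypothetical stand-in for train_set_x_orig:
# m = 3 tiny "images", each 2 x 2 pixels with 3 channels.
m, h, w, c = 3, 2, 2, 3
X = np.arange(m * h * w * c).reshape(m, h, w, c)

step1 = X.reshape(X.shape[0], -1)  # one image per row
A = step1.T                        # transpose: one image per column
B = X.reshape(-1, X.shape[0])      # no transpose, second dimension set to m

print(step1.shape, A.shape, B.shape)
```

Note that A and B come out with the same shape, so the shape alone cannot tell the two approaches apart.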

In this course, the convention is to have each column of X_flatten represent one training example, so X_flatten has shape (number of features, number of examples) and each column is the feature vector for a single example.

X.reshape(X.shape[0], -1).T achieves this by first reshaping X with X.shape[0] as the first dimension and then transposing it, making X.shape[0] the second dimension, thus aligning each training example in columns.
X.reshape(-1, X.shape[0]), on the other hand, produces an array with the same shape, but reshape simply refills the new shape from the elements in their original row-major order, with no regard for where one training example ends and the next begins. As a result, each column ends up containing pixels taken from several different images.

The error you encountered is most likely because the values in the reshaped array are scrambled even though its shape is correct. Any check that compares the flattened values against the expected ones will fail, and a model trained on this data would be learning from mixed-up images.
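Here is a small sketch of that difference, again on a made-up toy array rather than the real dataset. Because X is filled with consecutive integers, dividing a value by the number of pixels per image (an illustrative trick, not part of the assignment) tells you which image it came from:

```python
import numpy as np

# Hypothetical toy data: 3 "images" of 2 x 2 x 3 = 12 values each.
# Image i holds the values 12*i ... 12*i + 11.
m = 3
X = np.arange(m * 2 * 2 * 3).reshape(m, 2, 2, 3)
pixels_per_image = 2 * 2 * 3

A = X.reshape(X.shape[0], -1).T   # formula from the assignment
B = X.reshape(-1, X.shape[0])     # the "simpler" version

same_shape = A.shape == B.shape
same_values = np.array_equal(A, B)

# Which source images contribute to the first column of each result?
sources_A = np.unique(A[:, 0] // pixels_per_image)
sources_B = np.unique(B[:, 0] // pixels_per_image)

print(same_shape, same_values)  # True False
print(sources_A)                # [0]
print(sources_B)                # [0 1 2]
```

The first column of A contains only image 0, while the first column of B mixes values from all three images.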

If you want to go a little further, print the result of both approaches and check whether they differ:

print(X_flatten)

paulinpaloalto · January 3, 2024, 7:26pm

In addition to Pastor’s explanation, here is a thread which shows a way to visualize why the two methods are not equivalent. Ending up with the same shape is not enough: the elements also have to land in the right positions.
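One quick way to check this yourself, sketched here on a hypothetical toy array (not the method from the linked thread), is a round trip: take each column of the flattened matrix, reshape it back to the image dimensions, and compare it with the original image.

```python
import numpy as np

# Hypothetical toy data: 3 "images" of 2 x 2 pixels with 3 channels.
m, h, w, c = 3, 2, 2, 3
X = np.arange(m * h * w * c).reshape(m, h, w, c)

A = X.reshape(X.shape[0], -1).T   # formula from the assignment
B = X.reshape(-1, X.shape[0])     # the "simpler" version

# Does column i, reshaped back to (h, w, c), reproduce image i?
recovered_A = all(np.array_equal(A[:, i].reshape(h, w, c), X[i]) for i in range(m))
recovered_B = all(np.array_equal(B[:, i].reshape(h, w, c), X[i]) for i in range(m))

print(recovered_A, recovered_B)  # True False
```

Only the version with the transpose gives back the original images, which is what you would also see if you plotted the recovered columns.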
