DLS_C1: Transforming X & Y to get the correct dimensions

I want to test the neural network implemented at the end of the first course on my own example (factorial prediction). For this purpose, I pass a 1-dimensional array of inputs X with shape (m, 1), where m = len(inputs), and an output Y with the same shape.

And I got an error in the calculations: at the backpropagation step I get dW with shape (11, 11) (m = 11), but W has shape (1, 3), so I cannot do W = W - learning_rate * dW.

I am pretty sure I need to transform X and Y so that they have a different shape (X should be (Nx, m) and Y should be (1, m)), but I am not sure about X. Could somebody explain how to transform X and Y correctly, given that the initial data is as follows:

X = [0., 1., 2., 3., 4., 5., 6., 7., 8., 9., 10.]
Y = [1., 1., 2., 6., 24., 120., 720., 5040., 40320., 362880., 3628800.]

In your given example, the X and Y shapes are (11,). If you want to follow the same equations as Prof. Andrew showed in this course, then you have to make them (1, 11) by reshaping them, e.g. X = X.reshape(1, -1), and the same for Y. Here, your data is (n, m), where n is the number of features and m is the number of examples. You can grab m like m = X.shape[1]. And it is better to use a NumPy array, like X = np.array([0., 1., 2., 3., 4., 5., 6., 7., 8., 9., 10.])
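The reshaping described above can be sketched directly on the poster's data:

```python
import numpy as np

# The original 1-D arrays of shape (11,)
X = np.array([0., 1., 2., 3., 4., 5., 6., 7., 8., 9., 10.])
Y = np.array([1., 1., 2., 6., 24., 120., 720., 5040., 40320., 362880., 3628800.])

# Reshape to (n, m): one feature per row, one example per column
X = X.reshape(1, -1)   # shape (1, 11)
Y = Y.reshape(1, -1)   # shape (1, 11)

m = X.shape[1]         # number of examples
print(X.shape, Y.shape, m)  # (1, 11) (1, 11) 11
```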

Best,
Saif.

After I did this I got an error:

ValueError: shapes (20,11) and (1,11) not aligned: 11 (dim 1) != 1 (dim 0)

It’s in the forward propagation method, on the line where I calculate Z.
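The mismatch can be reproduced in isolation (assuming, as the error suggests, that the first weight matrix was created with 11 columns):

```python
import numpy as np

# W1 ended up as (20, 11), but X is (1, 11), so the inner dimensions differ
W1 = np.zeros((20, 11))
X = np.zeros((1, 11))
try:
    Z1 = np.dot(W1, X)
except ValueError as e:
    print(e)  # shapes (20,11) and (1,11) not aligned: 11 (dim 1) != 1 (dim 0)

# With a first layer that expects 1 feature, the product works
W1 = np.zeros((20, 1))
Z1 = np.dot(W1, X)
print(Z1.shape)  # (20, 11)
```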

Double-check how you are initializing W and b.
W = (number of neurons of the current layer, number of neurons of the previous layer [or number of features, in case of layer 1])
b = (number of neurons of the current layer, 1)
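Those shape rules can be written out as a minimal initialization sketch (the scaling factor of 0.01 is just one common choice; the course assignments also use He initialization):

```python
import numpy as np

def initialize_parameters(layers_dims, seed=1):
    """Create W and b for each layer with the shapes described above."""
    rng = np.random.default_rng(seed)
    parameters = {}
    for l in range(1, len(layers_dims)):
        # W[l]: (units in layer l, units in layer l-1)
        parameters[f"W{l}"] = rng.standard_normal(
            (layers_dims[l], layers_dims[l - 1])) * 0.01
        # b[l]: (units in layer l, 1)
        parameters[f"b{l}"] = np.zeros((layers_dims[l], 1))
    return parameters

params = initialize_parameters([1, 20, 7, 5, 1])
print(params["W1"].shape, params["b1"].shape)  # (20, 1) (20, 1)
```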

Yep, it is correct. The code I use is code I had already implemented; it works with an images dataset (the training dataset), but it does not work with my X and Y.

This is the layers_dims I use, btw:

layers_dims = [11, 20, 7, 5, 1]

So, regarding the error from my previous post:

ValueError: shapes (20,11) and (1,11) not aligned: 11 (dim 1) != 1 (dim 0)

20 is the number of units in the first hidden layer, and 11 is the input length.

Your layers_dims is wrong. The first one should be the number of features (which is 1 in your example).
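With the first entry corrected to the number of features, the shapes line up at every layer. A sketch of how the shapes flow through the forward pass (using zeros just to check dimensions):

```python
import numpy as np

layers_dims = [1, 20, 7, 5, 1]  # first entry = number of features, not number of samples
m = 11
A = np.zeros((layers_dims[0], m))  # input activations: (1, 11)
for l in range(1, len(layers_dims)):
    W = np.zeros((layers_dims[l], layers_dims[l - 1]))
    b = np.zeros((layers_dims[l], 1))
    Z = W @ A + b  # (n_l, n_{l-1}) @ (n_{l-1}, m) -> (n_l, m)
    A = Z
print(A.shape)  # (1, 11)
```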


Now my AL (from the L_model_forward method) contains only NaN values. I would like to debug a bit to find out why this happens.

Go ahead. If you completed all the assignments of this course, I am sure you will figure it out on your own.


What are you using as the loss function? What you are doing is a regression, not a categorical prediction, so the cross entropy loss is not going to work, right?

I’m using cross-entropy (the loss function from the training example of the course), but now I understand it is wrong and I need something different (probably MSE?). Please correct me if I’m wrong.

Yes, MSE. You also have to change the activation function of the last layer and its derivative. And changing the loss function also requires changing its derivative in backpropagation, right?
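A minimal sketch of what that swap looks like, assuming a linear (identity) activation on the output layer (the 1/2 factor is a common convention that cancels in the derivative):

```python
import numpy as np

def mse_cost(AL, Y):
    # Mean squared error over m examples
    m = Y.shape[1]
    return np.sum((AL - Y) ** 2) / (2 * m)

def mse_cost_derivative(AL, Y):
    # dJ/dAL for the cost above; with a linear output activation, dZ = dAL
    m = Y.shape[1]
    return (AL - Y) / m

AL = np.array([[1., 2., 7.]])
Y = np.array([[1., 1., 6.]])
print(mse_cost(AL, Y))  # (0 + 1 + 1) / (2 * 3) ≈ 0.333
```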


Could you please also explain why the first layer dimension is 1? Why can’t it be the same as the length of the input dataset?

There are two different dimensions of the input: the number of features (the first dimension) and the number of samples (the second dimension). So how many features does each sample have? Just one number, right? And there are 11 samples. Or if you included 11!, there would be 12 of them.

I have a feeling that a neural network is not a good choice for predicting factorial values. In my case it does not work as expected, even though I tried different activation functions and different parameters.

Yes, that would have been my expectation, but sometimes you get surprised when you try something.

NNs are good at pattern recognition, but it typically requires a lot more than O(10) or even O(10^2) samples of training data to learn to detect the patterns.

But it’s still always a learning experience when you try to apply what we’ve learned in the course to a new problem. E.g. in this case, you learned how to use an MSE cost function and the effect that has on back propagation. So the effort was definitely not wasted. Onward! :nerd_face:


Could you also advise how to transform a dataset correctly? For example, I got some datasets from kaggle.com and found not only numbers there, but also words. I guess I need to do some extra processing on the words to get a dataset filled with only numbers?

You can use .drop to exclude a particular column or row from your data set. Below is an example from ChatGPT:

import pandas as pd

# Create a DataFrame
data = {'A': [1, 2, 3],
        'B': [4, 5, 6],
        'C': ["word1", "word2", "word3"]}
df = pd.DataFrame(data)

# Drop a column
df = df.drop('C', axis=1)
print(df)


I think Kaggle datasets have some null values too. You can use .isnull and .fillna if you need to.
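For example (a hypothetical DataFrame; filling with the column mean is just one common strategy):

```python
import numpy as np
import pandas as pd

# Hypothetical data with missing values
df = pd.DataFrame({'A': [1.0, np.nan, 3.0],
                   'B': [4.0, 5.0, np.nan]})

print(df.isnull().sum())   # count of missing values per column
df = df.fillna(df.mean())  # fill each NaN with its column mean
print(df)
```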
