Hi, @NiveeNasr. First off, your post is in violation of the course honor code. You are not allowed to post your code; post only your traceback (which you did). Please be mindful of that in your future participation. Thanks!
The ValueError exception is telling you that your matrices in np.dot(w.T, X) are not of the right shape for matrix multiplication. It even points out what a valid multiplication requires: the number of columns of w.T must equal the number of rows of X. So which one is out of whack? Maybe both?
Not the latter; the model_test function executed in the test cell sets up X for you, so it has shape (4, 7). Now where is your misshapen w coming from? I am guessing that your model function ignores the shape of X_train, the first parameter of model. Check the call to initialize_with_zeros within the model function.
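Just to make the shape rule concrete (these are made-up shapes and throwaway names, not your actual assignment variables):

```python
import numpy as np

X = np.zeros((4, 7))          # the test cell gives X this shape: 4 rows, 7 columns

w_bad = np.zeros((2, 1))      # wrong number of rows for this X
# np.dot(w_bad.T, X)          # ValueError: shapes (1,2) and (4,7) not aligned

w_good = np.zeros((4, 1))     # 4 rows to match X's 4 rows
print(np.dot(w_good.T, X).shape)   # (1, 7) -- a valid multiplication
```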
Yes, Ken has given you exactly the right clue to find that error. Then once you fix that, the next problem you are going to hit is that your values will be wrong. The reason is that you are calling the optimize function incorrectly. I’m guessing you are fairly new to Python and are not familiar with how “named parameters” work. What you have done is hard-code the values of the number of iterations, the learning rate, and the print flag when you call optimize. The way you have written the code, the actual values for those parameters that are passed into model will be ignored and the answer will always be the same. That will not end well.
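Here is a generic illustration of that pitfall with a toy function (the names inner/outer are made up, not the assignment’s functions): if the inner call hard-codes a keyword value, whatever the caller passed to the outer function never reaches it.

```python
def inner(x, steps=100):
    # stand-in for some iterative routine
    return x * steps

def outer_wrong(x, steps=100):
    # BUG: hard-codes steps=50, so the caller's value is silently ignored
    return inner(x, steps=50)

def outer_right(x, steps=100):
    # passes the caller's value through by name
    return inner(x, steps=steps)

print(outer_wrong(1, steps=2000))   # 50   -- not what the caller asked for
print(outer_right(1, steps=2000))   # 2000
```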
That means you are still passing the wrong number of iterations to optimize when you call it from model. This is the point I was making in my earlier reply: you were previously overriding the value to be 50 iterations, but now you are overriding it with 2000. So how did that happen?
Hello @kenb and @paulinpaloalto, I got the same error the original poster had in this thread, but when I try to follow the answers given and change initialize_with_zeros to take into account the shape of X_train, I get the error below:
My sense is that I’m still passing an incorrect argument to the initialize_with_zeros function, but I can’t figure out what the correct argument is. I can hardcode 4 to get farther into the exercise, but that both feels wrong and seems to lead to incorrect values for dw.
Hard-coding is always a bad idea unless they specifically tell you to do that. Well, it seems pretty obvious which variable they are complaining about: it’s dim. So what is the type and value of dim when you make that call?
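If you want to see for yourself, a throwaway print at the top of your initialize_with_zeros would show what is actually arriving (a debugging sketch only; remove it before submitting):

```python
def initialize_with_zeros(dim):
    # temporary debugging: what type and value did the caller actually pass in?
    print("dim is", type(dim), "with value", dim)
    ...  # rest of your implementation unchanged
```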
Oh, actually, you can see the bug in the exception trace: you’re passing X_train[0]. That is taking a 2D array X_train and “slicing off” the first row of it. So that will give you a 1D array of floating point values. That is obviously not what was required there.
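To make that concrete with a small made-up array (not the course data): indexing a 2D array with a single integer selects a row of data, not anything from the shape.

```python
import numpy as np

X_train = np.arange(12, dtype=float).reshape(4, 3)   # made-up (4, 3) array

row = X_train[0]      # the first ROW of the data
print(row)            # [0. 1. 2.]
print(row.shape)      # (3,) -- a 1D array of floats, not an integer dimension
```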
I think dim is an int and it’s set to 2 earlier in the assignment, so passing dim as the argument would result in a dimensionality problem: 2 != 4.
Oh, hm. I thought passing X_train[0] would pass the first value in the shape tuple for X_train, but it instead just passes the first row of the array, got it. Would the correct argument then be X_train.shape[0]? Doesn’t that break if X_test has a different shape?
Apologies for missing what may be obvious, I am new to Python and haven’t done any work with vectors for about a dozen years, so it’s slow going.
Yes, you’re right about using the first element of the shape tuple, not the first row or even the first element of the first row of X_train itself. As to the X_test question, well, they both have the same number of rows, right? It’s the number of columns that may differ between different parts of the dataset. The first element of the shape tuple is the number of rows and the second is the number of columns, right? Each column of a sample matrix (X_train or X_test in this case) is one sample vector with n_x elements, where n_x is the number of input features.
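In other words (illustrative sizes, not the real pixel counts): the row count is the shared feature dimension n_x, while the column count is the number of samples and can differ between splits.

```python
import numpy as np

n_x = 4                          # illustrative number of input features
X_train = np.zeros((n_x, 7))     # 7 training samples, one per column
X_test = np.zeros((n_x, 3))      # 3 test samples, one per column

print(X_train.shape[0] == X_test.shape[0])   # True: same number of rows (features)
print(X_train.shape[1], X_test.shape[1])     # 7 3 -- the sample counts differ
```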
And the point about dim is that I was referring to the local variable within the scope of the initialize_with_zeros function. That’s got nothing to do with the global variable dim that might have 2 or some other value. Even though the name is the same, they are different variables. If you are new to programming and the concept of the “scope” of a variable is unfamiliar to you, that would be something you would want to remedy by reading some Python tutorials on the subject. If that term is new to you, let me know and I’ll scare up some links.
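A tiny illustration of scope with generic names (nothing here is from the assignment): a function’s parameter is a separate variable from a global that happens to share its name.

```python
dim = 2                    # a global variable named dim

def initialize(dim):
    # this dim is the function's local parameter; it shadows the global one
    print("inside the function, dim is", dim)

initialize(4)                                 # inside the function, dim is 4
print("outside the function, dim is", dim)    # still 2
```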
Thanks a ton for the in-depth explanation, I think the issue is I had the standard matrix notation (m x n) mixed up with the ML usage of m for the number of examples. So X_train’s shape is (num_px * num_px * 3, m_train), and X_test is (num_px * num_px * 3, m_test). Meaning, as you said, X_train.shape[0] = X_test.shape[0].
However, I am now running into the following error, meaning something is wrong with the way dw is calculated, but I can’t put it together. Sorry for the slew of questions, and I really appreciate your help!
That error is telling you that your w value is wrong, not dw. There could be many reasons for that, but my first suggestion is to look at how you call optimize from model. There should be no equal signs in the parameters that you pass to optimize. If you are new to Python, you should read up on how “named parameters” work.