Hi,
I’m having trouble with the gradientDescent function: I pass 4 tests and fail 4. I think the code is correct and I can’t find any issues. Can someone help me review this, please?
If the tests fail, that means your code is not correct. We cannot directly see your code. The first thing to try is to show us the output you are getting when you run the tests and any other test cells for that section. Not the code please, just the output for now.
I get this output for the first function check
The cost after training is 0.67401311.
The resulting vector of weights is [1.8e-07, 0.00028977, -0.00010529]
I get this output for w1_unittest.test_gradientDescent(gradientDescent)
Wrong output for the loss function. Check how you are implementing the matrix multiplications.
Expected: 0.6709497038162118.
Got: 0.6740131131542673.
Wrong values for weight's matrix theta. Check how you are updating the matrix of weights.
Expected: [[4.10713435e-07]
[3.56584699e-04]
[7.30888526e-05]].
Got: [[ 1.80143528e-07]
[ 2.89771918e-04]
[-1.05294021e-04]].
Wrong output for the loss function. Check how you are implementing the matrix multiplications.
Expected: 6.5044107216556135.
Got: 1.9950223354133323.
Wrong values for weight's matrix theta. Check how you are updating the matrix of weights.
Expected: [[ 9.45211976e-05]
[ 2.40577958e-02]
[-1.77876847e-02]
[ 1.35674845e-02]].
Got: [[ 9.87948472e-05]
[ 1.10059736e-01]
[-8.14303704e-02]
[ 4.34224218e-02]].
4 Tests passed
4 Tests failed
Ok, your values differ in all the test cases, so there is definitely something different about your code. I added a print statement to my code to show the value of m (number of samples) and here’s what I get for the first function test cell:
m = 10
The cost after training is 0.67094970.
The resulting vector of weights is [4.1e-07, 0.00035658, 7.309e-05]
Computing m is not one of the complex parts, though, so that’s probably not where the problem lies. It’s still worth a quick check to make sure.
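For anyone who wants to run the same kind of check, here is a minimal sketch of a shape printout. The helper name `describe_inputs` and the placeholder arrays are made up for illustration and are not part of the assignment. It assumes x has one row per sample and that y and theta are column vectors.

```python
import numpy as np

def describe_inputs(x, y, theta):
    """Hypothetical debugging helper: report the sample count and array shapes."""
    m = x.shape[0]  # number of samples = number of rows of x
    print(f"m = {m}")
    print(f"x: {x.shape}, y: {y.shape}, theta: {theta.shape}")
    return m

# Placeholder data with the same layout as the first test cell (assumed: 10 samples, 3 features).
x = np.ones((10, 3))
y = np.zeros((10, 1))
theta = np.zeros((3, 1))
m = describe_inputs(x, y, theta)
```

If m or any of the shapes is not what you expect, that is the first thing to fix before looking at the formulas.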
Beyond that it’s a matter of carefully comparing the code you wrote to the instructions and the math formulas that they gave you for the various steps here. Please do that one more time. BTW I assume that your sigmoid function passes the tests earlier in the notebook. That’s one part of the computation here, of course.
If carefully examining your code doesn’t yield anything, then it’s probably time for me to look at your code. We can’t do that in a public thread like this, but there is a way. I’ll send you a DM (direct message) about how to proceed. DMs show up in your message list, but with the little “envelope” icon.
I’m facing the same issue. Could you share the explanation here as well? Thank you.
EDIT: I found the solution for my case.
To close the loop on the public thread here, the issue was that the z and h values were calculated as scalars (the sum over the samples). We need to maintain them as vectors, so that we can see the separate predictions per sample.
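To make the difference concrete without posting assignment code, here is a small sketch of the shapes involved. The data is random and the variable names are illustrative assumptions. The `sigmoid` here is a plain stand-in, not the graded function.

```python
import numpy as np

def sigmoid(z):
    # Stand-in logistic function, works elementwise on arrays and on scalars.
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
m, n = 10, 3                              # assumed: 10 samples, 3 features
x = rng.normal(size=(m, n))               # one row per sample
y = rng.integers(0, 2, size=(m, 1)).astype(float)
theta = rng.normal(size=(n, 1))

# Vector form: one z and one prediction h per sample.
z_vec = x @ theta                         # shape (10, 1)
h_vec = sigmoid(z_vec)                    # shape (10, 1)

# Collapsed form: summing over the samples leaves a single number.
z_scalar = np.sum(x @ theta)              # shape ()
h_scalar = sigmoid(z_scalar)              # shape ()

# Broadcasting lets the scalar version run without an error,
# so the gradient still has the right shape but the wrong values.
grad_vec = x.T @ (h_vec - y) / m          # shape (3, 1)
grad_scalar = x.T @ (h_scalar - y) / m    # shape (3, 1)

print(z_vec.shape, h_vec.shape, np.shape(z_scalar), np.shape(h_scalar))
print(grad_vec.shape, grad_scalar.shape)
```

Because both gradients come out with the same shape, nothing crashes. The only symptom is the kind of mismatch in cost and weights shown in the test output earlier in the thread, so printing the shapes of z and h is a quick way to catch it.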