Programming assignment for "planar data classification with one hidden layer" - is the test case for "backpropagation" faulty?

I don’t know why such things keep happening to me … :joy: :sob:

So I’m trying to implement the “backpropagation” step in Week 3’s programming assignment.

The test case consists of two parts:

  1. A test that just prints results to STDOUT, and the student can then visually compare with what’s on the screen

  2. A test called backward_propagation_test() in public_tests.py, which compares the student-generated values against the expected values using

assert np.allclose(output["dW1"], expected_output["dW1"]), "Wrong values for dW1"

etc.

It turns out that my code passes (1) (i.e. the output is as expected) but fails (2).

After trying out various things for a long time, it turns out that test (2) is really wonky.

It initializes all the parameters, as well as the cache, to random values (but with the random seed set, so those random values are the same on every run), calls the user’s function once, and then tests whether dW1, db1, dW2, db2 match its expectations.

However, this makes no sense:

    cache = {'A1': np.random.randn(9, 7),
             'A2': np.random.randn(1, 7),
             'Z1': np.random.randn(9, 7),
             'Z2': np.random.randn(1, 7)}

These values must be set to something from a valid feed-forward run!

Am I supposed to do that myself? Am I just confused?
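To illustrate what I mean, here is a minimal sketch of a self-consistent cache (my own illustration, not the actual test code; I’m assuming a tanh hidden layer and a sigmoid output, as in this assignment, and the seed is arbitrary):

    import numpy as np

    np.random.seed(2)  # keep the fixture deterministic, as the test already does

    # Draw only the pre-activations at random ...
    Z1 = np.random.randn(9, 7)
    Z2 = np.random.randn(1, 7)

    # ... and derive the activations from them, so the cache looks like the
    # output of a real forward pass instead of four unrelated random arrays.
    A1 = np.tanh(Z1)               # hidden layer activation
    A2 = 1 / (1 + np.exp(-Z2))     # output layer activation (sigmoid)

    cache = {'A1': A1, 'A2': A2, 'Z1': Z1, 'Z2': Z2}

    # Now 1 - A1**2 and 1 - np.tanh(Z1)**2 agree, so either form of the
    # tanh derivative would produce the same gradients.
    assert np.allclose(1 - A1**2, 1 - np.tanh(Z1)**2)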

The fun part is that the test case passes if one uses the A1 from the cache for computing the local derivatives of g[1], as indicated under “tips”. I did not do that. That’s why I ran into problems.

The test passes with either of

local_g1_derivatives = 1 - np.power(A1, 2) 
local_g1_derivatives = 1 - A1**2 

But the equivalent based on Z1 does not:

local_g1_derivatives = 1-(np.tanh(Z1)**2)

Although db2 and dW2 match what is expected, db1 and dW1 do not.
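To make that concrete, here is a quick check (a sketch only; the seed is arbitrary and not the test’s actual fixture):

    import numpy as np

    np.random.seed(0)                # arbitrary seed for this illustration
    A1 = np.random.randn(9, 7)       # "A1" drawn independently, like in the test
    Z1 = np.random.randn(9, 7)       # "Z1" drawn independently, unrelated to A1

    # With a cache from a real forward pass these would be identical;
    # with independent random draws they are not, so dZ1 (and hence
    # dW1 and db1) depends on which form of the derivative you use.
    print(np.allclose(1 - A1**2, 1 - np.tanh(Z1)**2))   # False

db2 and dW2 never touch the local derivative of g[1], which is why they still match.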

P.S.

Here are some light fixes for the same backward_propagation_test(target):

Y is built like this:

    Y = (np.random.randn(1, 7) > 0)

This creates an array of booleans. Let’s keep the contract “Y is an array of numerics” as promised:

    Y = (np.random.randn(1, 7) > 0).astype(np.uint8) # keep the promise to deliver numerics, not booleans
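Just to illustrate the difference (not part of the test file):

    import numpy as np

    Y_bool = np.random.randn(1, 7) > 0                     # what the test builds today
    Y_num  = (np.random.randn(1, 7) > 0).astype(np.uint8)  # numeric version

    print(Y_bool.dtype)   # bool
    print(Y_num.dtype)    # uint8

NumPy upcasts booleans in arithmetic (e.g. A2 - Y_bool still yields floats), so the test happens to work either way; this is purely about keeping the documented contract that Y holds numeric labels.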

The tests should be ordered so that the results that come early in the dataflow diagram are examined first. That would tell the student that, for example, there is already a problem with dZ2. If dW1 is tested first instead, it will obviously be wrong whenever dZ2 is already wrong, and the student will be looking in the wrong place. Hence:

    # Type and shape as expected?
    # Test the db2, dW2 first to give the user info about failure early in processing
    
    assert type(output["db2"]) == np.ndarray, f"Wrong type for db2. Expected: {np.ndarray}"
    assert type(output["dW2"]) == np.ndarray, f"Wrong type for dW2. Expected: {np.ndarray}"
    
    assert output["db2"].shape == expected_output["db2"].shape, f"Wrong shape for db2."
    assert output["dW2"].shape == expected_output["dW2"].shape, f"Wrong shape for dW2."

    assert type(output["db1"]) == np.ndarray, f"Wrong type for db1. Expected: {np.ndarray}"
    assert type(output["dW1"]) == np.ndarray, f"Wrong type for dW1. Expected: {np.ndarray}"
        
    assert output["db1"].shape == expected_output["db1"].shape, f"Wrong shape for db1."
    assert output["dW1"].shape == expected_output["dW1"].shape, f"Wrong shape for dW1."

    # Content as expected?
    # Test the db2, dW2 first to give the user info about failure early in processing

    assert np.allclose(output["db2"], expected_output["db2"]), "Wrong values for db2"
    assert np.allclose(output["dW2"], expected_output["dW2"]), "Wrong values for dW2" 
       
    assert np.allclose(output["db1"], expected_output["db1"]), "Wrong values for db1"    
    assert np.allclose(output["dW1"], expected_output["dW1"]), "Wrong values for dW1"
1 Like

Yes, this is a known problem with that test case. I filed a bug about this a while ago but it has not been addressed yet. The thread I linked is from Nov 2021 and I filed the bug in March 2023, when another student hit this issue.

If you consider the efficiency of the code, using A1 instead of recomputing tanh(Z1) is also way more efficient. :smile: After you went to all that trouble to avoid unnecessarily computing logarithms in the cost function, note that computing tanh involves not one but two exponentials, right? Well, to be completely correct, the second exponential is not strictly necessary, since it could be obtained by division (the multiplicative inverse of the first), but you don’t really know what the np.tanh code does internally.
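If you want to see the difference for yourself, here is a rough timing sketch (my own illustration with made-up shapes; absolute numbers depend on your machine and NumPy build):

    import numpy as np
    import timeit

    Z1 = np.random.randn(9, 7000)      # a larger "Z1" so the timing is measurable
    A1 = np.tanh(Z1)                   # what forward propagation would have cached

    reuse     = timeit.timeit(lambda: 1 - A1**2,           number=1000)
    recompute = timeit.timeit(lambda: 1 - np.tanh(Z1)**2,  number=1000)

    print(f"reuse cached A1:  {reuse:.4f} s")
    print(f"recompute tanh:   {recompute:.4f} s")   # typically noticeably slower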

2 Likes

Also note that they literally wrote the code out for you and explained why it is as it is in the instructions:
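Roughly, that hint boils down to the standard gradient formulas for a one-hidden-layer network with a tanh hidden layer and a sigmoid output. What follows is only a sketch under that assumption, not a verbatim copy of the notebook:

    import numpy as np

    def backward_sketch(X, Y, W2, A1, A2):
        """Standard gradients for a 1-hidden-layer net with tanh hidden units
        and a sigmoid output (a sketch, not the notebook's graded code)."""
        m = Y.shape[1]
        dZ2 = A2 - Y
        dW2 = np.dot(dZ2, A1.T) / m
        db2 = np.sum(dZ2, axis=1, keepdims=True) / m
        dZ1 = np.dot(W2.T, dZ2) * (1 - np.power(A1, 2))  # reuses A1 = tanh(Z1) from the cache
        dW1 = np.dot(dZ1, X.T) / m
        db1 = np.sum(dZ1, axis=1, keepdims=True) / m
        return {"dW1": dW1, "db1": db1, "dW2": dW2, "db2": db2}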

Yes, the test case is wrong, but this may explain why they don’t seem to put much priority on fixing this particular bug. In terms of actual frequency of occurrence, I would estimate that students hit this issue about once every six months, averaged over the 7 years the course has been live. And literally thousands of students have taken the course in that time.

I did not realize this either, until this thread. I also think that the test only serves to check whether the exercise was completed as instructed (in Paul’s screenshot). I mean, David, even though I agree that 1 - (np.tanh(Z1)**2) should have worked, these tests are not that intelligent and they have limitations. I do not mean that the limitation cannot be addressed, only that it has not been addressed.

David, I think this is a good example of how a dataflow diagram, or that kind of mindset, can help! You pointed out that Z1 and A1 should be related rather than both randomly initialized, because Z1 → A1, and you could justify your order of checking the derivatives by their dependencies. :raised_hands: :raised_hands: :raised_hands:

Raymond

1 Like

I know, I just missed that trick at first.

I am sorry to say that I find the behavior of whoever is on the fixer-upper team extremely unprofessional. This should have been fixed the day after it was reported. It’s literally a matter of adding one single call.

I have spent several hours trying to find what I didn’t understand about the formulas.

If the test is run, the assumption should be that at least one feed-forward pass has been performed so that the cache is valid, or at the very least the student should be told to set that up themselves.

It says backward_propagation_test() after all, not random_input_test().

Well, it should. :sweat:

Oh well. I also got two timeouts on the grader (I suppose that’s what is meant by “keyboard interrupt”), but the third attempt passed, so there is that.

A couple of informational points worth mentioning:

  1. The mentors have no employment relationship with either DLAI or Coursera. We can only file bugs, but cannot modify the course materials. We are fellow students and we are volunteers, meaning we do not get paid to do this.
  2. This change also involves a change to the grader, because the grader also fails your logically correct code. I have zero visibility into how the grader platform works, but it is provided by Coursera, and my observation is that the course staff is very reluctant to make changes that affect the grader’s behavior. One can only speculate that “There Be Dragons …”.

Well, as I pointed out, the code was literally written out for you in the instructions, including the explanation of why they wrote it that way. So this is the classic case where “saving time” by not reading the instructions carefully ends up not being a net saving of time: two minutes saved and several hours wasted. :scream_cat:

1 Like

To risk the dragons involved in fixing bugs that require modifying the grader, the reward must be extremely high.

Fixing an issue with a very low rate of occurrence is not a high priority, when ranked against the other tasks DLAI staff performs (like creating entirely new courses for the ever-evolving Deep Learning field).

Everything you need to pass this assignment is available either in the notebook itself (via the hints), or by asking questions on the Forum.

2 Likes

David, if I were you (and by the way, I was once like you, when I first took the older version of DLS myself years ago), I think the best strategy moving forward is to keep your curiosity, your approach, and your style, and to challenge anything along the way, while also knowing that the graders and the tests have certain limitations.

We might have different perspectives sometimes: you as a learner, we as learners, we as mentors, and we as course testers. Despite those differences, we are all here to discuss your ideas, and perhaps even our own ideas beyond the limitations of the graders, because ultimately what matters is that, in the real world, we do things in the best way we can, without those kinds of limitations. I think we just pick up what we can from the courses, make it ours, and then move on :wink: :wink:

Cheers,
Raymond

1 Like