C1 - W3 Exercise 7 copy.deepcopy

Why do we only deepcopy W1 and W2 and not the biases?

Hi, Metehan.

The same question is discussed on this thread as well. It explains why we use copy.deepcopy for w.

This link digs deeper into the subject of shallow and deep copy.

Thank you for your answer.

It’s worth looking at this a bit more. The first link above is about Logistic Regression in MLS Week 1, which is a different case. In LR, the bias is a scalar, but w is a vector.

The question here is about the instructions and the comments in the template code in DLS C1 W3 Planar Data Assignment for the update_parameters function. That is a different case because both the W and b values there are numpy arrays. I claim that what they say in the notebook is slightly inaccurate. Whatever you do, you need to treat both the W and b values the same way. The problem is that python objects are passed “by reference” on python function calls: the function receives a reference to the caller’s object, not a copy of it, so any in-place change made inside the function is visible to the caller. (Here’s another thread that discusses that in some detail.) That means that the variable parameters that is passed to update_parameters is an object reference to a global object, which happens to be a python dictionary. A dictionary is a data structure that holds references to the target objects, which are numpy arrays in this instance.
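To make the reference point concrete, here is a minimal sketch. The function update_in_place and the toy shapes are made up for illustration; this is not the notebook's code, just a stand-in that updates the arrays in place without copying anything.

```python
import numpy as np

def update_in_place(parameters, learning_rate=0.1):
    # Hypothetical stand-in for update_parameters, with no copying at all.
    W1 = parameters["W1"]  # same array object the caller holds
    b1 = parameters["b1"]
    W1 -= learning_rate * np.ones_like(W1)  # in-place update
    b1 -= learning_rate * np.ones_like(b1)
    return {"W1": W1, "b1": b1}

parameters = {"W1": np.zeros((2, 2)), "b1": np.zeros((2, 1))}
result = update_in_place(parameters)

# Both the W and the b entries of the caller's dictionary were modified.
leaked_W = bool(np.any(parameters["W1"] != 0))
leaked_b = bool(np.any(parameters["b1"] != 0))
print(leaked_W, leaked_b)
```

Note that it is the in-place form (`-=`) that leaks. Writing `W1 = W1 - learning_rate * ...` instead creates a new array and leaves the caller's data alone, which is why some solutions appear to work without any copy.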

The difference between plain copy and deepcopy is explained by the second link Rashmi gave above.
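Applied to a dictionary of numpy arrays, the difference looks roughly like this. It is a small sketch with made-up toy values, not taken from the notebook.

```python
import copy
import numpy as np

original = {"W1": np.zeros((2, 2)), "b1": np.zeros((2, 1))}

shallow = copy.copy(original)   # new dict, but the same array objects inside
deep = copy.deepcopy(original)  # new dict and new arrays

shares_shallow = shallow["W1"] is original["W1"]
shares_deep = deep["W1"] is original["W1"]

deep["W1"] += 1.0     # only the deep copy changes
untouched = bool(np.all(original["W1"] == 0))

shallow["b1"] += 1.0  # changes original["b1"] too, since the array is shared
changed = bool(np.all(original["b1"] == 1))

print(shares_shallow, shares_deep, untouched, changed)
```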

Of course the fundamental point here is that update_parameters is going to change the contents of the dictionary and then return it, still as a dictionary. We want to avoid modifying the global data in this case, because the same global data is used as the input for several test cases.

With that as prelude, I think the actual answer is that you have several choices here:

  1. The cleanest solution is just to do copy.deepcopy() on the parameters dictionary itself. As we know from reading Rashmi’s link, that will recursively copy everything, including duplicating the targets of the dictionary.
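  2. We could individually do shallow copies of the elements of the dictionary, as in copy.copy(parameters["W1"]) and copy.copy(parameters["b1"]) and so forth. That works because numpy arrays of numbers are simple objects: they hold the values directly rather than references to other objects, so a shallow copy already duplicates the data.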
  3. You could do copy.deepcopy() on all the individual entries like parameters["b2"]. That is still correct, but it’s just slightly overkill and not really necessary in that case.
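The three choices above can be sketched side by side. The function names option_1 to option_3, the helper make_parameters, and the toy values are hypothetical and only there to show that each approach protects the caller's data for both W and b.

```python
import copy
import numpy as np

def make_parameters():
    # Hypothetical toy values; the real notebook uses its own shapes.
    return {"W1": np.ones((2, 2)), "b1": np.ones((2, 1))}

def option_1(parameters):
    # Deep copy of the whole dictionary, then pull out the entries.
    params = copy.deepcopy(parameters)
    return params["W1"], params["b1"]

def option_2(parameters):
    # Shallow copy of each individual array.
    return copy.copy(parameters["W1"]), copy.copy(parameters["b1"])

def option_3(parameters):
    # Deep copy of each individual array (correct, slightly overkill).
    return copy.deepcopy(parameters["W1"]), copy.deepcopy(parameters["b1"])

results = []
for option in (option_1, option_2, option_3):
    parameters = make_parameters()
    W1, b1 = option(parameters)
    W1 -= 0.5  # in-place updates on the copies
    b1 -= 0.5
    # The caller's arrays should still hold their original values.
    safe = bool(np.all(parameters["W1"] == 1) and np.all(parameters["b1"] == 1))
    results.append(safe)

print(results)
```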

Just as a matter of programming style, it’s fine to always use copy.deepcopy() in a case like this. If the target object is simple and didn’t really need the “deep” copy, it does no harm other than executing a little bit more code to check for the deep case.

Thank you, Paul sir. This truly is something that I missed! Appreciate your help, always!