In a previous lab (C1), we used .copy() / deepcopy to save an input — I think that was with NumPy, or maybe just plain Python.
In this lab, the provided code just uses `=`, like this:

# Save the input value. You'll need this later to add back to the main path.
X_shortcut = X
We then proceed to change X.
Why didn't we need to deepcopy X into X_shortcut here? Is it because it's a tensor, and TensorFlow does that automatically?
Hi Inposition,
Good question. As you guessed, it is related to TensorFlow. For the purposes of this exercise, plain assignment is fine; only in certain cases might we need tf.identity. This thread explains it very well:
python - Tensorflow: what is the difference between tf.identity and '=' operator - Stack Overflow.
Happy learning,
Rosa
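To make that concrete, here is a minimal plain-Python sketch of the distinction that thread draws; `=` behaves the same way in Python, NumPy, and TensorFlow, and an explicit copy stands in here for the role `tf.identity` can play for tensors (the variable names are just illustrative):

```python
import copy

# '=' never copies: it just binds another name to the same object.
X = [1.0, 2.0]
X_alias = X
assert X_alias is X        # both names reference one object

# An explicit copy gives a distinct object with the same contents,
# which is the behavior you'd reach for tf.identity to get.
X_copy = copy.copy(X)
assert X_copy is not X     # different object
assert X_copy == X         # same values
```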
This is an excellent question! I'm embarrassed that I never thought of this before when examining that code. But I think we need to go one step beyond the link that Rosa gave us. What that Stack Overflow thread shows is that TF works the same way Python and NumPy do: after that assignment statement, X_shortcut is in fact a reference to the same memory object as X. What saves us is that all the following statements where we change X look like this:
X = <the return value of some TF/Keras function>
What happens when TF evaluates the RHS of that assignment is that it ends up allocating a new memory object for the result. And then when it performs the assignment, X ends up referencing that new memory object instead of whatever it was pointing at before that statement.
So what saves us is that none of the changes we make to X are “in place” operations.
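Here is a short plain-Python sketch of exactly that distinction (the reference semantics are the same as in TensorFlow; the list and the doubling step just stand in for a tensor and a Keras layer call):

```python
# After plain assignment, both names reference the same object.
X = [1, 2, 3]
X_shortcut = X
assert X_shortcut is X            # same memory object

# Rebinding X to the result of a function allocates a NEW object;
# X_shortcut still references the original input.
X = [v * 2 for v in X]            # stands in for X = SomeLayer(...)(X)
assert X_shortcut is not X        # X now points at the new result
assert X_shortcut == [1, 2, 3]    # the saved input is untouched

# By contrast, an IN-PLACE mutation would change both names:
Y = [1, 2, 3]
Y_shortcut = Y
Y[0] = 99
assert Y_shortcut[0] == 99        # the "saved" value got clobbered
```

So the shortcut survives only because every update to X is a rebinding, not an in-place write.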
In case anyone is not familiar with the problem that Sean is referring to in the OP of this thread, here’s an earlier thread that explains the issue.