In a previous lab (C1), we used .copy() / deepcopy to save an input — I think that was with NumPy, or maybe just plain Python.
In this lab, the provided code just uses `=`, like this:

# Save the input value. You'll need this later to add back to the main path.
X_shortcut = X
We then proceed to change X.
Why didn't we need to deepcopy X into X_shortcut here? Is it because it's a tensor, and TensorFlow does that automatically?
Hi Inposition,
Good question. As you guessed, it is related to TensorFlow. For the purposes of this exercise, plain assignment is fine; only in certain cases might we need tf.identity. This thread explains it very well:
python - Tensorflow: what is the difference between tf.identity and '=' operator - Stack Overflow.
Happy learning,
Rosa
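To make that concrete, here is a minimal plain-Python sketch of the distinction that thread draws; `=` behaves the same way in Python, NumPy, and TensorFlow, and an explicit copy stands in here for the role `tf.identity` can play for tensors (the variable names are just illustrative):

```python
import copy

# '=' never copies: it just binds another name to the same object.
X = [1.0, 2.0]
X_alias = X
assert X_alias is X        # both names reference one object

# An explicit copy gives a distinct object with the same contents,
# which is the behavior you'd reach for tf.identity to get.
X_copy = copy.copy(X)
assert X_copy is not X     # different object
assert X_copy == X         # same values
```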
This is an excellent question! I'm embarrassed that I never thought of this before when examining that code. But I think we need to go one step beyond the link that Rosa gave us. What that Stack Overflow thread shows is that TF works the same way Python and NumPy do: after that assignment statement, X_shortcut is in fact a reference to the same memory object as X. What saves us is that all the following statements where we change X look like this:
X = <the return value of some TF/Keras function>
What happens when TF evaluates the RHS of that assignment is that it ends up allocating a new memory object for the result. And then when it performs the assignment, X ends up referencing that new memory object instead of whatever it was pointing at before that statement.
So what saves us is that none of the changes we make to X are “in place” operations.
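Here is a short plain-Python sketch of exactly that distinction (the reference semantics are the same as in TensorFlow; the list and the doubling step just stand in for a tensor and a Keras layer call):

```python
# After plain assignment, both names reference the same object.
X = [1, 2, 3]
X_shortcut = X
assert X_shortcut is X            # same memory object

# Rebinding X to the result of a function allocates a NEW object;
# X_shortcut still references the original input.
X = [v * 2 for v in X]            # stands in for X = SomeLayer(...)(X)
assert X_shortcut is not X        # X now points at the new result
assert X_shortcut == [1, 2, 3]    # the saved input is untouched

# By contrast, an IN-PLACE mutation would change both names:
Y = [1, 2, 3]
Y_shortcut = Y
Y[0] = 99
assert Y_shortcut[0] == 99        # the "saved" value got clobbered
```

So the shortcut survives only because every update to X is a rebinding, not an in-place write.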
In case anyone is not familiar with the problem that Sean is referring to in the OP of this thread, here’s an earlier thread that explains the issue.