C2_W1_Lab_2_gradient-tape-basics - one or two gradient tapes

I experimented with the following code to see if we can use just one tape here:

import tensorflow as tf

x = tf.Variable(1.0)

with tf.GradientTape(persistent=True) as tape_2:
    y = x * x * x

    # first gradient, taken inside the context so the tape records it
    dy_dx = tape_2.gradient(y, x)

# second gradient, taken outside the context
d2y_dx2 = tape_2.gradient(dy_dx, x)

print(dy_dx)
print(d2y_dx2)

I got the right output, but with a warning:
WARNING:tensorflow:Calling GradientTape.gradient on a persistent tape inside its context is significantly less efficient than calling it outside the context (it causes the gradient ops to be recorded on the tape, leading to increased CPU and memory usage). Only call GradientTape.gradient inside the context if you actually want to trace the gradient in order to compute higher order derivatives.
tf.Tensor(3.0, shape=(), dtype=float32)
tf.Tensor(6.0, shape=(), dtype=float32)

Question: Since our goal in this case is that the gradient computed inside the context gets recorded, so that the 2nd gradient can be calculated later, is it good practice to use just one tape in this way, or is it really better to use 2 tapes as we learned and avoid the warning?

I think it's better to avoid the warning, and it even gives the reason: higher CPU and memory usage!
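
For comparison, here is roughly how the two-tape version from the lesson avoids the warning: each gradient call happens outside its own tape's context, while the first call still runs inside the outer tape's context so it gets traced for the second derivative. A minimal sketch (the comments are mine):

import tensorflow as tf

x = tf.Variable(1.0)

with tf.GradientTape() as tape_2:
    with tf.GradientTape() as tape_1:
        y = x * x * x
    # outside tape_1's context (no warning), but inside tape_2's context,
    # so tape_2 still records the ops that compute dy_dx
    dy_dx = tape_1.gradient(y, x)

# outside tape_2's context (no warning)
d2y_dx2 = tape_2.gradient(dy_dx, x)

print(dy_dx)    # tf.Tensor(3.0, shape=(), dtype=float32)
print(d2y_dx2)  # tf.Tensor(6.0, shape=(), dtype=float32)

As far as I can tell, with a single persistent tape you can't avoid the warning, because the first gradient call has to stay inside the context for the tape to trace it.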


Why doesn’t the same approach work in the following case too?
Here the first gradient calculation is also inside a with block, namely the inner (second) one.

import tensorflow as tf

x = tf.Variable(1.0)

with tf.GradientTape() as tape_2:
    with tf.GradientTape() as tape_1:
        y = x * x * x

        dy_dx = tape_1.gradient(y, x)
        
    # this is also acceptable
    d2y_dx2 = tape_2.gradient(dy_dx, x)

print(dy_dx)
print(d2y_dx2)

OK, it’s not persistent … I got it.
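
If I understand it correctly, a non-persistent tape doesn't allow calling gradient inside its own with block at all, so the inner call fails before the second derivative is ever computed. Making tape_1 persistent should make the inner call legal again (with the same efficiency warning as in my first snippet), as long as the second gradient is taken outside tape_2's block. A rough sketch of what I mean:

import tensorflow as tf

x = tf.Variable(1.0)

with tf.GradientTape() as tape_2:
    # persistent=True lets us call tape_1.gradient inside its own context
    with tf.GradientTape(persistent=True) as tape_1:
        y = x * x * x

        # traced by tape_2 (we are still inside its context),
        # but triggers the efficiency warning on tape_1
        dy_dx = tape_1.gradient(y, x)

# second gradient taken outside tape_2's context
d2y_dx2 = tape_2.gradient(dy_dx, x)

print(dy_dx)
print(d2y_dx2)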
