What's the difference between "optimizer.minimize" and "tape.gradient" + "optimizer.apply_gradients"?

Hi friends,

In my TensorFlow class, the professor showed two examples. In the first example, he wrote:

import tensorflow as tf

w = tf.Variable(0.0, dtype=tf.float32)
optimizer = tf.keras.optimizers.Adam(0.1)   # any tf.keras optimizer works here

def train_step():
    with tf.GradientTape() as tape:
        cost = w ** 2 - 10 * w + 25          # forward pass is recorded on the tape
    trainable_variables = [w]
    grads = tape.gradient(cost, trainable_variables)
    optimizer.apply_gradients(zip(grads, trainable_variables))
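
He then called it in a loop, something like this (my reconstruction; the exact iteration count is my guess):

for _ in range(1000):
    train_step()
print(w)   # approaches 5.0, the minimum of w**2 - 10*w + 25 = (w - 5)**2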

In the second example, he wrote:

def training(x, w, optimizer):
    def cost_fn():                           # a callable, so minimize() can re-evaluate the cost
        return x[0] * w ** 2 + x[1] * w + x[2]
    for _ in range(1000):
        optimizer.minimize(cost_fn, [w])     # one gradient step per call
    return w
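
For context, the cost coefficients were passed in as data, something like this (a sketch by me; the exact values and optimizer are assumptions):

import numpy as np

x = np.array([1.0, -10.0, 25.0], dtype=np.float32)   # coefficients of x[0]*w**2 + x[1]*w + x[2]
w = tf.Variable(0.0, dtype=tf.float32)
optimizer = tf.keras.optimizers.Adam(0.1)
w = training(x, w, optimizer)
print(w)   # again approaches 5.0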

So optimizer.minimize(cost_fn, [w]) simply did everything in one call. My question is: what does optimizer.minimize actually do? Why don't we always just use

    grads = tape.gradient(cost, trainable_variables)
    optimizer.apply_gradients(zip(grads, trainable_variables))
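
From what I can tell, minimize seems to be roughly shorthand for the tape pattern above. Here is my own sketch of what I imagine it does internally (my_minimize is just a made-up name, not the real implementation):

def my_minimize(optimizer, cost_fn, var_list):
    # my guess: record the cost on a tape, compute gradients, then apply them
    with tf.GradientTape() as tape:
        cost = cost_fn()
    grads = tape.gradient(cost, var_list)
    optimizer.apply_gradients(zip(grads, var_list))

Is that about right, or does minimize do more than this?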

I know I need to study TensorFlow more, but can someone give me a little "preview" of what is going on here? Thank you!
