About the gradient penalty

  • Question 1:

    When we calculate the norm of the gradient with
    “gradient_norm = gradient.norm(2, dim=1)”,
    what do the parameters “2” and “dim=1” mean?

  • Question 2:

    Why do we do “gradient = gradient.view(len(gradient), -1)”?
    I think tensor.view(len(tensor), -1) yields the same size as the original tensor. Please tell me where my understanding is wrong.

Thank you!

The comments in the template code answer most of your questions. Here’s that section with the comments:

    # Flatten the gradients so that each row captures one image
    gradient = gradient.view(len(gradient), -1)

    # Calculate the magnitude of every row
    gradient_norm = gradient.norm(2, dim=1)

The point is that we are dealing with 4D tensors here. The purpose of the view() is to “unroll” each one into a 2D tensor. For Question 1, the 2 means you want the “2-norm”, which is the Euclidean length if the input were a vector. The dim=1 says you are treating each row of the flattened tensor as a separate input and computing the Euclidean length (2-norm) of each row. So the result will be a 1D tensor with the number of entries equal to the number of rows. Here’s a little snippet of code to show the behavior:

import torch

# Stand-in for the 4D gradient: a batch of 256 images, each 3 x 16 x 32
foo = torch.zeros(256, 3, 16, 32)
print(foo.shape)
print(f"len(foo) {len(foo)}")
# Flatten everything after the batch dimension into one row per image
viewFoo = foo.view(len(foo), -1)
print(viewFoo.shape)
# 2-norm of each row
normViewFoo = viewFoo.norm(2, dim=1)
print(normViewFoo.shape)

Running that gives this output:

torch.Size([256, 3, 16, 32])
len(foo) 256
torch.Size([256, 1536])
torch.Size([256])
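
If it helps to see why the per-row norm matters, here is a minimal sketch of my own (not the template code). It shows that the 2-norm of a plain vector is just the Euclidean length, and how those per-image norms are typically used afterwards, assuming the usual WGAN-GP-style penalty of mean((norm - 1)^2); the gradient_norm values below are made up for illustration.

import torch

# A plain vector: the 2-norm is the Euclidean length, sqrt(3^2 + 4^2) = 5
vec = torch.tensor([3.0, 4.0])
print(vec.norm(2))             # tensor(5.)

# Pretend per-image gradient norms (the 1D tensor from the snippet above)
gradient_norm = 1.5 * torch.ones(256)

# WGAN-GP-style penalty: how far each norm is from 1, averaged over the batch
penalty = torch.mean((gradient_norm - 1) ** 2)
print(penalty)                 # tensor(0.2500)

The key point is that the penalty is computed from one scalar norm per image, which is exactly why the gradient is flattened to one row per image first.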