Inverted Dropout

Did you complete the experiment of dividing 42 by 0.8? What did you get? Now try dividing -0.573 by 0.8. What happens to its absolute value?
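If you'd rather let Python do the arithmetic, here's a quick check (the keep probability 0.8 matches the example that follows):

```python
keep_prob = 0.8

for x in (42, -0.573):
    scaled = x / keep_prob
    # Dividing by a number < 1 grows the absolute value by 1/keep_prob
    print(x, "->", scaled, "| magnitude grows by", abs(scaled) / abs(x))
```

Either way, the absolute value gets bigger by a factor of 1/0.8 = 1.25.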

The 2-norm of a matrix is the square root of the sum of the squares of the elements of the matrix, right? (Strictly speaking, this element-wise definition is the Frobenius norm, which is what np.linalg.norm computes by default for a matrix.) It is the generalization of the Euclidean length of a vector. The interpretation in more than one dimension is a bit more complicated than mere “length”, but you can think of it as a measure of the “magnitude” of a matrix. It does have a geometric interpretation similar to length when you consider matrices as linear transformations between two vector spaces.
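As a quick sanity check on that definition, the element-wise formula agrees with NumPy's default matrix norm (a minimal sketch with a matrix chosen so the answer is obvious):

```python
import numpy as np

M = np.array([[3.0, 4.0], [0.0, 0.0]])

# Square root of the sum of the squared elements ...
by_hand = np.sqrt(np.sum(M ** 2))
# ... matches np.linalg.norm's default for a 2-D array (Frobenius)
print(by_hand, np.linalg.norm(M))  # both are 5.0
```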

So let’s just create a relatively small matrix with normally distributed values:

np.random.seed(42)
A = np.random.randn(3,4)
print("A = " + str(A))
print("2-norm(A) = " + str(np.linalg.norm(A)))

Running that gives this:

A = [[ 0.49671415 -0.1382643   0.64768854  1.52302986]
 [-0.23415337 -0.23413696  1.57921282  0.76743473]
 [-0.46947439  0.54256004 -0.46341769 -0.46572975]]
2-norm(A) = 2.672810732482017

Now let’s try multiplying by 1/0.8 and see what happens:

B = A * (1/0.8)
print("B = " + str(B))
print("2-norm(B) = " + str(np.linalg.norm(B)))

Running that gives this:

B = [[ 0.62089269 -0.17283038  0.80961067  1.90378732]
 [-0.29269172 -0.2926712   1.97401602  0.95929341]
 [-0.58684298  0.67820005 -0.57927212 -0.58216219]]
2-norm(B) = 3.3410134156025215

It’s easy to prove that ||m·A|| = |m|·||A||, where m is a real scalar and A is a real-valued matrix. If you check with your calculator, you’ll see that is exactly what happened here:

1/0.8 = 1.25, and 2.672810732482017 × 1.25 ≈ 3.3410134156025215
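To tie this back to inverted dropout itself: the division by 0.8 is exactly the rescaling step, applied to the units that survive the dropout mask. Here is a minimal sketch (the names keep_prob, mask, and A_dropped are my own for illustration, not from any particular assignment):

```python
import numpy as np

np.random.seed(42)
keep_prob = 0.8

A = np.random.randn(3, 4)  # stand-in for a layer's activations

# Inverted dropout: zero out roughly 20% of the units, then scale
# the survivors by 1/keep_prob so the expected activation is unchanged
mask = (np.random.rand(*A.shape) < keep_prob).astype(float)
A_dropped = (A * mask) / keep_prob

print("kept fraction :", mask.mean())
print("2-norm before :", np.linalg.norm(A))
print("2-norm after  :", np.linalg.norm(A_dropped))
```

The norms won’t match exactly (some entries were zeroed while the rest grew by a factor of 1.25), but on average the upscaling compensates for the dropped units, which is the whole point of the “inverted” part.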
