Course 4, Week 1, Programming Assignment 2

Steven_Saito · March 19, 2022, 6:01pm

Hello,
I am curious why the batch size differs between the training statements for the Happy model and the Sign model:

happy_model.fit(X_train, Y_train, epochs=10, batch_size=16)
train_dataset = tf.data.Dataset.from_tensor_slices((X_train, Y_train)).batch(64)

Thanks!
Steve

paulinpaloalto · March 19, 2022, 7:22pm

You filed this under Course 3, not Course 4. I modified the title for you by using the little “edit pencil” on the title.

The minibatch size is what Prof Ng calls a “hyperparameter”, meaning a value that you need to choose as the system designer, as opposed to a “parameter”, which can be learned through back propagation. The best choice for a given situation is not always the same, otherwise there would be one “golden” value that everyone always uses. There are just some “rules of thumb”, e.g. the famous quote from Yann LeCun: “Friends don’t let friends use batch sizes greater than 32.” But apparently even that rule doesn’t apply in every situation.
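
To make the effect of that choice a bit more concrete, here is a minimal sketch in plain Python. It is not the assignment's code: the helper names `steps_per_epoch` and `iter_minibatches` and the dataset size of 1000 are made up for illustration. All it shows is how the minibatch size changes the number of gradient updates you get in one pass over the training set.

```python
import math


def steps_per_epoch(num_examples: int, batch_size: int) -> int:
    """Number of minibatches (gradient updates) in one pass over the data."""
    return math.ceil(num_examples / batch_size)


def iter_minibatches(examples, batch_size):
    """Yield consecutive slices of at most `batch_size` examples."""
    for start in range(0, len(examples), batch_size):
        yield examples[start:start + batch_size]


# Hypothetical training set size, chosen only for illustration.
num_examples = 1000
examples = list(range(num_examples))

for batch_size in (16, 32, 64):
    batches = list(iter_minibatches(examples, batch_size))
    # The last minibatch is smaller when batch_size does not divide num_examples.
    print(batch_size, steps_per_epoch(num_examples, batch_size), len(batches[-1]))
```

Roughly speaking, a smaller minibatch gives more (but noisier) updates per epoch, and a larger one gives fewer (but smoother) updates. Which works better depends on the model and the data, which is why it is left as a hyperparameter.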

paulinpaloalto · March 19, 2022, 7:25pm

If you came directly to Course 4 and skipped Course 2, it might be worth looking at some of the lectures in Course 2. The main focus of Weeks 1 and 2 of Course 2 is exploring different hyperparameters and discussing systematic ways of selecting them. I think it’s in the lectures in Week 2 where he introduces and discusses minibatch gradient descent.

Steven_Saito · April 3, 2022, 12:39am

Thank you! I wasn’t aware of the hyperparameter vs parameter distinction, but it makes sense. I did take Course 3 (got it confused with 4 - thanks also for correcting my post’s title).

I did take Courses 1-3, but I do find myself referring back to them time and again to remind myself.
