Neural network input shape: batch size vs features

In this course the shape of the input to a neural network is specified as (features, training examples), so when implementing a neural network in TensorFlow, why is the input shape taken as (batch_size, features)? Isn't that inconsistent?

Please follow the conventions of the framework you're using. In TensorFlow, the input_shape parameter should specify the dimensions of a single training example; the batch size should not be included as part of the shape information. If you print model.summary(), you'll notice that None appears as the 0th element of each layer's output shape, which represents the batch dimension.
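
For illustration, here is a minimal sketch of that convention with tf.keras (the feature count and layer sizes are made up): input_shape describes a single example, and the batch dimension shows up as None when you print the summary.

```python
import tensorflow as tf

# input_shape is the shape of ONE example (20 features here, a hypothetical number);
# the batch size is deliberately left out.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(8, activation="relu", input_shape=(20,)),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

model.summary()
# The printed output shapes look like (None, 8) and (None, 1):
# None is the batch dimension, which is only fixed when you call fit/predict.
```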

Yes, exactly, the batch size comes later. But shouldn't it be (features, None) instead of (None, features)?

Batch is the 0th dimension in TensorFlow, so the shape is [None, features].

Got it, but my question was different. Andrew told us to take the features as rows and the training examples as columns, i.e. for a dataset of n_x features and m training examples, the input layer should be of shape (n_x, m), because we are stacking the different training examples column-wise.
So when implementing a neural network in TensorFlow, shouldn't the input shape be (features, training examples per batch)? In other words, shouldn't it be (features, None)?
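
(For concreteness, here is a tiny NumPy sketch of the layout I mean, with made-up sizes: each column of X is one training example, so X has shape (n_x, m).)

```python
import numpy as np

# Course-1 convention (sizes are hypothetical): columns are training examples.
n_x, m = 4, 5                   # 4 features, 5 training examples
X = np.random.randn(n_x, m)     # shape (n_x, m) = (features, examples)
print(X.shape)                  # (4, 5)
```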

Which lecture and timestamp are you referring to?

In the first course of the Deep Learning Specialization (Neural Networks and Deep Learning), Week 3, "Vectorizing over multiple training examples" at 4:24.

The lecture teaches how to perform vectorization. Andrew focuses on vectorization based on the data layout he uses.

TensorFlow expects the batch to be the 0th dimension.

Thank you for the clarification, but Andrew also taught us to use features as rows and the number of training examples as columns at the very beginning of the course, although you are correct that in the lecture I mentioned earlier, Andrew was indeed talking about vectorization.

Throughout Course 1 and up to Week 3 of Course 2, Prof Ng uses the features x samples orientation of the data. Then in the middle of the C2 W3 assignment, right at the point where he needs to call a TF loss function for the first time, he adds a transpose to get samples x features.

So the high-level point is that this is simply a choice that you can make either way, but of course choices have consequences. My guess is that Prof Ng uses features x samples in Course 1 because we're writing the code by hand in Python and it works out more cleanly with that orientation. But then, as Balaji pointed out, TensorFlow has made the opposite choice, so we need to shift gears when we get to the point of using TF as our primary implementation mechanism instead of hand-coding the core algorithms in Python.
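
As a rough sketch of what that switch looks like in practice (the shapes and layer here are hypothetical, not the actual assignment code): data laid out as (features, examples) gets transposed to (examples, features) before being handed to a TF layer or loss.

```python
import numpy as np
import tensorflow as tf

# Hypothetical sizes: 3 features, 5 examples, in the course's layout.
n_x, m = 3, 5
X_course = np.random.randn(n_x, m).astype("float32")               # (features, examples)
Y_course = np.random.randint(0, 2, size=(1, m)).astype("float32")  # (1, m)

# TensorFlow expects samples first, so transpose before calling TF code.
X_tf = X_course.T                                                   # (m, n_x)
Y_tf = Y_course.T                                                   # (m, 1)

logits = tf.keras.layers.Dense(1)(X_tf)                             # (m, 1)
loss = tf.keras.losses.BinaryCrossentropy(from_logits=True)(Y_tf, logits)
print(loss.numpy())
```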

Not all datasets and tools are constructed identically.
