I am referencing the C2 W1 assignment, but my question is not about the assignment itself (I have already finished it).
So in the C2 W1 assignment, there is a neural network that classifies handwritten digits, and for the input the images are collapsed to individual pixels, i.e., from a 20 x 20 image to a 400 x 1 vector of pixels. Does this lose the positional data of the pixels?
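To make sure I understand what the collapsing step does, here is a small NumPy sketch I wrote (my own toy example, not assignment code), showing that flattening uses a fixed row-major ordering and can be undone:

```python
import numpy as np

# a toy 20 x 20 "image" of pixel values (stand-in for a digit image)
img = np.arange(400).reshape(20, 20)

# flattening to a 400-element vector uses a fixed row-major order,
# so pixel (r, c) always lands at index r * 20 + c
flat = img.reshape(400)
assert flat[3 * 20 + 7] == img[3, 7]

# the mapping is invertible -- no pixel values are lost
assert np.array_equal(flat.reshape(20, 20), img)
```

So every pixel keeps a fixed index; what I am unsure about is whether the network can still exploit which pixels were 2D neighbors.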
And with the input shape set to (400,), the number of parameters is as follows:
Model: "my_model"
_________________________________________________________________
 Layer (type)                Output Shape              Param #
=================================================================
 dense_3 (Dense)             (None, 25)                10025
 dense_4 (Dense)             (None, 15)                390
 dense_5 (Dense)             (None, 1)                 16
=================================================================
Total params: 10,431
Trainable params: 10,431
Non-trainable params: 0
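If I follow the formula from the lectures correctly (each Dense layer has inputs x units weights plus one bias per unit), these counts check out:

```python
# parameter counts for the (400,)-input model:
# each Dense layer has (n_in * n_out) weights plus n_out biases
layers = [(400, 25), (25, 15), (15, 1)]
params = [n_in * n_out + n_out for n_in, n_out in layers]
print(params)       # [10025, 390, 16]
print(sum(params))  # 10431
```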
I am practicing coding all the concepts learned so far in Week 1 from scratch to get a better understanding, and I am using the "mnist" dataset I found in the TensorFlow tutorials, which has input images of digits at 28 x 28 resolution. I wanted to check what would happen if I created a model with the input layer size set to (28, 28), and this is the model summary I got:
Model: "3_layer_model_Sigmoid"
_________________________________________________________________
 Layer (type)                Output Shape              Param #
=================================================================
 layer_1 (Dense)             (None, 28, 25)            725
 layer_2 (Dense)             (None, 28, 15)            390
 layer_3 (Dense)             (None, 28, 10)            160
=================================================================
Total params: 1275 (4.98 KB)
Trainable params: 1275 (4.98 KB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
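From experimenting, my guess is that Dense multiplies only along the last axis, which would explain both the 725 params (28 x 25 + 25) and the (None, 28, 25) output shape. Here is my NumPy sketch of the shapes (the shapes are my assumption about the behavior, not the actual Keras internals):

```python
import numpy as np

# guess: a Dense layer's kernel contracts only the LAST input axis,
# so with input shape (28, 28) each of the 28 rows is treated as a
# separate 28-feature vector
batch, rows, feats, units = 32, 28, 28, 25
x = np.random.rand(batch, rows, feats)
W = np.random.rand(feats, units)  # kernel: 28 x 25 = 700 weights
b = np.random.rand(units)         # plus 25 biases -> 725 params
out = x @ W + b                   # matmul over the last axis, batched over the rest
print(out.shape)                  # (32, 28, 25)
```

Is this roughly what is happening under the hood?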
Forgot to mention, but all layers in both models have sigmoid as the activation. So now I am having a hard time understanding those output shapes for the layers. What does vectorization with multi-dimensional input data look like? Maybe if I understood vectorization for a single linear regression unit with multi-dimensional input features, then I might get a better understanding, or am I totally off track with this whole thing?
Edit: I think for images maybe I should flatten, like in the TensorFlow tutorial, but why does the Dense layer accept multi-dimensional inputs? Where is this used?
Also, tf.keras.layers.Flatten() is just an image-processing/data-processing layer, right? It comes under data manipulation rather than being part of the machine-learning algorithm, right? And can we, and do we, use these data-manipulation layers in between dense layers? Sorry if the question is getting too large; I will remove it if that is the case.
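For reference, my mental model of Flatten is just a reshape with no trainable parameters (a toy NumPy sketch, not the real Keras code):

```python
import numpy as np

# Flatten keeps the batch axis and collapses everything else,
# e.g. (batch, 28, 28) -> (batch, 784); it has no trainable parameters
x = np.random.rand(32, 28, 28)
flat = x.reshape(x.shape[0], -1)
print(flat.shape)  # (32, 784)
```

Is it correct to think of it this way?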