I’m looking at the code example where we make 2 layers, pass an input layer, get the activations from it and then pass it on to the next. However, I’m confused about how this happens. How does TensorFlow go from an input vector of 2 values to the first activation layer, with 3 values?
I'm not sure which lab you mean, since there are many of them in Course 2 Week 1. I'll answer in general terms, but if the specific lab matters, please share its name.
When we define a TensorFlow layer, we tell it the number of neurons (units); here I suppose it was 3. The first time the layer is called on an input, its build() method is triggered to create a weight matrix (2D array) of shape (2, 3): 2 rows and 3 columns. The 2 comes from the input shape, since you said each input has 2 values (features). The 3 comes from the 3 neurons. The input matrix (an array of shape (m, 2), where m is the number of examples) is multiplied by this weight matrix, and the matrix multiplication of shapes (m, 2) and (2, 3) results in (m, 3). The layer then adds a bias (one value per neuron) and applies the activation function. Neither step changes the shape, so each example ends up with 3 activation values.
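To make the shapes concrete, here is a minimal NumPy sketch of the same computation. It is not TensorFlow's actual code, just the arithmetic a dense layer performs. The batch size m = 4, the random values, and the sigmoid activation are all assumptions for illustration, since I don't know which lab or activation you are using.

```python
import numpy as np

rng = np.random.default_rng(0)

m = 4                          # number of examples (arbitrary, for illustration)
X = rng.normal(size=(m, 2))    # input: m examples, 2 features each

# Parameters a 3-neuron layer would need for a 2-feature input.
W = rng.normal(size=(2, 3))    # weights: one column per neuron
b = np.zeros(3)                # bias: one value per neuron

def sigmoid(z):
    # Assumed activation; the lab may use a different one.
    return 1.0 / (1.0 + np.exp(-z))

Z = X @ W + b                  # (m, 2) @ (2, 3) -> (m, 3); b is broadcast over rows
A = sigmoid(Z)                 # element-wise, so the shape stays (m, 3)

print(X.shape, W.shape, A.shape)
```

Each column of W holds the 2 weights of one neuron, which is why 2 input values can produce 3 outputs: every neuron computes its own weighted sum of the same 2 features.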