In week 1, in the inference/forward prop lecture, a dot product is computed using the activations of the previous layer and the weights of the next layer, but the layers have different sizes. Does that mean the weights in the next layer have a different cardinality than the number of activations in the previous layer?
I guess it would be clearer if I could see a matrix-form representation of the weights in each layer.
Since the weight matrix is the connection between two layers, its number of inputs equals the number of units in the previous layer, and its number of outputs equals the number of units in the next layer.
So if you have 3 units in Layer N and 5 units in Layer (N+1), the size of the weight matrix would probably be (5 x 3).
Or it might be (3 x 5); it depends on which convention the assignment uses. There is no universal agreement on the orientation of a weight matrix.
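In case it helps, here is a minimal NumPy sketch (the variable names are just illustrative, not taken from the course) showing both orientations for a 3-unit to 5-unit layer. Either way, each unit in Layer (N+1) computes a dot product over all 3 activations of Layer N, so the output has 5 values:

```python
import numpy as np

# Activations coming out of Layer N: 3 units
a_prev = np.array([0.2, 0.7, 1.5])      # shape (3,)

# Convention 1: W has shape (units_out, units_in) = (5, 3)
W_a = np.random.randn(5, 3)
b_a = np.zeros(5)
z_a = W_a @ a_prev + b_a                # shape (5,): one value per unit in Layer (N+1)

# Convention 2: W has shape (units_in, units_out) = (3, 5)
W_b = np.random.randn(3, 5)
b_b = np.zeros(5)
z_b = a_prev @ W_b + b_b                # also shape (5,)

print(z_a.shape, z_b.shape)             # (5,) (5,)
```

So the weight matrix always has one dimension matching the previous layer's activations (that's the dot product) and the other matching the next layer's units; which dimension comes first is purely a convention choice.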