Hi @James_Siddle,
I personally like to think of one filter (f x f x input_number_of_channels) as one neuron: if a Dense layer has 23 neurons, it produces 23 features; similarly, if a Conv layer has 23 filters, it produces 23 feature maps.
So, Dense → Conv; neurons → filters; features → feature maps.
I know it is not the norm to call a filter a neuron, but I like to think of it that way.
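To make the analogy concrete, here is a minimal pure-Python sketch (not how any framework actually implements these layers). The helper names `dense_layer` and `conv_layer`, the input sizes (10 inputs, an 8 x 8 x 3 image, f = 3), zero biases, no padding, and stride 1 are all my own assumptions for illustration. The point is only that the number of neurons or filters decides how many features or feature maps come out.

```python
import random

def dense_layer(x, weights, biases):
    """x: list of n_in values; weights: one list of n_in weights per neuron.
    Each neuron produces exactly one feature."""
    return [sum(w_i * x_i for w_i, x_i in zip(w, x)) + b
            for w, b in zip(weights, biases)]

def conv_layer(image, filters, biases):
    """image: H x W x C nested lists; filters: one f x f x C block per filter.
    No padding, stride 1. Each filter produces exactly one feature map."""
    H, W, C = len(image), len(image[0]), len(image[0][0])
    f = len(filters[0])
    maps = []
    for filt, b in zip(filters, biases):
        fmap = [[sum(filt[i][j][c] * image[r + i][col + j][c]
                     for i in range(f) for j in range(f) for c in range(C)) + b
                 for col in range(W - f + 1)]
                for r in range(H - f + 1)]
        maps.append(fmap)
    return maps

random.seed(0)
n_units = 23  # 23 neurons in the Dense layer, 23 filters in the Conv layer

# Dense: 10 inputs (assumed size) -> 23 features
x = [random.random() for _ in range(10)]
dense_w = [[random.random() for _ in range(10)] for _ in range(n_units)]
features = dense_layer(x, dense_w, [0.0] * n_units)

# Conv: 8 x 8 x 3 image (assumed size), 23 filters of shape 3 x 3 x 3
H, W, C, f = 8, 8, 3, 3
image = [[[random.random() for _ in range(C)] for _ in range(W)]
         for _ in range(H)]
conv_w = [[[[random.random() for _ in range(C)] for _ in range(f)]
           for _ in range(f)] for _ in range(n_units)]
feature_maps = conv_layer(image, conv_w, [0.0] * n_units)

print(len(features))                                   # 23 features
print(len(feature_maps),                               # 23 feature maps,
      len(feature_maps[0]), len(feature_maps[0][0]))   # each 6 x 6
```

One difference the sketch also shows: a neuron outputs a single number, while a filter is slid over the image and outputs a whole map, which is part of why the two are usually given different names.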
Yes, and our mentor @hackyon has shared a paper in this post, which I think you may want to have a look at.
Thanks for your amazing analysis.
Cheers,
Raymond