How are the filters determined?

```python
model = tf.keras.models.Sequential([
    # Note the input shape is the desired size of the image: 150x150 with 3 color channels
    tf.keras.layers.Conv2D(16, (3,3), activation='relu', input_shape=(150, 150, 3)),
    tf.keras.layers.Conv2D(32, (3,3), activation='relu'),
    tf.keras.layers.Conv2D(64, (3,3), activation='relu'),
    # Flatten the results to feed into a DNN
    tf.keras.layers.Flatten(),
    # 512-neuron hidden layer
    tf.keras.layers.Dense(512, activation='relu'),
    # Only 1 output neuron. It will contain a value from 0-1,
    # where 0 is one class ('cats') and 1 is the other ('dogs')
    tf.keras.layers.Dense(1, activation='sigmoid')
])
```

Hello, in the above model, how was the number of filters (the first parameter) determined? I understand it should increase at each layer, but why 16, 32, 64?

It is common to start with filters of 32, 64, 128; here that configuration didn't reach the desired training and validation accuracy of 80%. The dimensions of the images can affect the number of filters, but it is more related to the number of features you want the model to capture: the more features, the more filters. The basic idea behind choosing 16, 32, 64 is that we try different configurations until we get better accuracy.

So basically we keep rebuilding the model with different filter counts based on how well it trains. You can try different filters yourself and see how the accuracy varies.
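One concrete way to see what the filter count changes is to compute each Conv2D layer's trainable parameter count by hand: filters × (kernel height × kernel width × input channels + 1 bias). The sketch below is illustrative; `conv2d_params` is a made-up helper, not a Keras API, applied to the 16 → 32 → 64 stack from the model above.

```python
# Hypothetical helper: trainable parameters in a Conv2D layer,
# i.e. one (kh x kw x in_channels) kernel plus one bias per filter.
def conv2d_params(filters, kernel, in_channels):
    kh, kw = kernel
    return filters * (kh * kw * in_channels + 1)

# The 16 -> 32 -> 64 stack from the model above, on 3-channel input
total = 0
in_channels = 3
for filters in [16, 32, 64]:
    params = conv2d_params(filters, (3, 3), in_channels)
    print(filters, params)
    total += params
    in_channels = filters  # this layer's output feeds the next layer
print("total:", total)  # 448 + 4640 + 18496 = 23584
```

You can check these numbers against `model.summary()`: doubling the filters roughly quadruples a layer's parameters (double the kernels, each reading double the input channels), which is part of why the starting width matters for training cost.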

Read this article; it explains the cats vs. dogs classification.


I think there are two generalizable takeaways from this explanation.

  1. There are now two decades of research on CNN architectures for image classification, going back to AlexNet (2012) and even earlier. We're not at a blank slate, so it makes sense to start with an architectural pattern that has been shown to work on other, hopefully similar, problems.

  2. Evaluate the performance of that baseline architecture against the new dataset and other functional and non-functional requirements (e.g., does it need to run on an edge device in near-real time). Iteratively and incrementally adjust parameters and measure the resulting impact until 'good enough'.
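The two steps above can be sketched as a simple search loop. Here `train_and_score` is a hypothetical stand-in for building a model at a given starting width, training it, and returning validation accuracy as a percentage; it is stubbed deterministically so the control flow runs standalone.

```python
# Sketch of the "adjust and measure until good enough" loop described above.
def train_and_score(base_filters):
    # Placeholder scoring: pretend wider models score somewhat higher,
    # capped at 90. A real version would call model.fit() and evaluate.
    return min(70 + 5 * (base_filters // 16), 90)

TARGET = 80  # "good enough" validation accuracy, in percent
best = None
for base_filters in [16, 32, 64]:  # candidate starting widths
    acc = train_and_score(base_filters)
    if best is None or acc > best[1]:
        best = (base_filters, acc)
    if acc >= TARGET:  # stop iterating once the requirement is met
        break
print(best)
```

The design point is the stopping rule: you don't search for the best possible architecture, only the first one that meets the stated requirements, then spend remaining effort on the other constraints (latency, model size, and so on).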

This process has been the way deep learning solutions have been developed for a number of years. Maybe now generative AI is changing that, as fewer people will be designing and training new models, and more will be shaping model ‘prompts’ instead.

But Kevin, do you agree with creating models just for fun or learning? I have model designs where you learn more than you have fun making them; shaping them does give better end results, but most of the models are still not perfect. Does that mean artificial intelligence can never replace human intelligence?

When Prof. Ng used to explain filters, I would imagine those smaller steps that give me a better view at each stage as I get closer to my destination; that is how I could relate to his videos, especially the climbing-the-mountain example.

In my opinion, the solution space of possible architectures is so large that one should never start completely from scratch and just make something up. Rather, you start with something that works, maybe not perfectly, then try to reason about what could be done to improve performance or overcome a limitation. To me this is imperative for people early in their ML journey. I believe the number of architectures that won't work is profoundly larger than the number that will, so without starting from a reasonable baseline one could wander aimlessly in a sub-optimal space for a long time. When you're learning to cook, you start out using recipes until you get a sense of how ingredients can be combined…you don't just go to the pantry and grab random stuff. That comes later :woman_cook: