In Neural Architecture Search, the network (e.g. UNet) whose approximately best architecture we want to find has to be defined beforehand.
The question is: how can we know whether the network's overall structure and its modules are defined correctly?
For example, if we want to search over a UNet architecture, how do we check that the residual block, and hence the encoder and decoder of the UNet, is defined correctly, leaving aside more sophisticated features such as an attention mechanism?
P.S. Aside from visualization with torchview, are there any other methods?
You can’t, other than by experimentation, evaluation, and application of your own skill and experience.
From your experience, could you please tell me how you do the checking?
- 1) Start with the simplest model.
- 2) Evaluate the performance (i.e. test set error).
- 3) If the results aren’t “good enough” for your needs, make the model more complex and repeat steps 2 and 3.
- 4) If you never get “good enough” performance, try a different model entirely and go back to step 1.
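The four steps above can be sketched as a small loop. This is only an illustration: `search`, `toy_evaluate`, the family names, and the error numbers are all made up for the example, and in practice `evaluate` would train the model and return its measured error.

```python
from typing import Callable, Iterable, Optional, Sequence, Tuple


def search(
    families: Iterable[Tuple[str, Sequence[int]]],
    evaluate: Callable[[str, int], float],
    target_error: float,
) -> Optional[Tuple[str, int, float]]:
    """Return the first (family, config, error) that meets the target, else None.

    Each family lists its configs from simplest to most complex.
    """
    for family, configs in families:          # step 4: try a different model
        for config in configs:                # steps 1 and 3: simplest first
            error = evaluate(family, config)  # step 2: measure the error
            if error <= target_error:         # "good enough", so stop here
                return family, config, error
    return None


def toy_evaluate(family: str, depth: int) -> float:
    """Hypothetical stand-in for training and testing a real model."""
    base = {"plain_cnn": 0.30, "unet": 0.10}[family]
    return base + 0.2 / depth


if __name__ == "__main__":
    families = [("plain_cnn", [1, 2, 3]), ("unet", [1, 2, 3, 4])]
    print(search(families, toy_evaluate, target_error=0.16))
```

With these toy numbers no `plain_cnn` depth reaches the target, so the loop moves on to `unet` and stops at depth 4.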
The process becomes more efficient as you gain experience to guide your choices.
Thank you for your answer, but I think you may have understood my question differently from how I meant it.
It seems that what you’ve described is the whole process of hyperparameter optimization or neural architecture search. My question is how to check whether each component of the model class is implemented correctly in code.
For example, I want to search for the best depth and residual block specification of a UNet. What I would really like to check is whether my implementation of the residual block is correct.
It’s no different from testing any piece of code. You have to feed it test cases where you know the expected results, and see what your code does.
If you’re just checking your implementation of an established method, you can often find the original author’s data set online; these are typically published along with the original paper so that others can check their work.
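For a single component, "test cases where you know the expected results" could look like the PyTorch sketch below. The `ResidualBlock` here is a hypothetical stand-in (two 3x3 convolutions plus a skip connection, no normalization), not the block from any particular paper, and the three checks are just examples of properties that can be derived by hand: the output shape, the output when the residual branch is forced to zero, and whether gradients reach every parameter.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ResidualBlock(nn.Module):
    """Hypothetical block under test: y = skip(x) + conv2(relu(conv1(x)))."""

    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.conv1 = nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1)
        # Use a 1x1 projection on the skip path only when channels change.
        if in_ch == out_ch:
            self.skip = nn.Identity()
        else:
            self.skip = nn.Conv2d(in_ch, out_ch, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.skip(x) + self.conv2(F.relu(self.conv1(x)))


def check_shape() -> None:
    # padding=1 with 3x3 kernels keeps H and W; only channels change.
    y = ResidualBlock(3, 8)(torch.randn(2, 3, 16, 16))
    assert y.shape == (2, 8, 16, 16)


def check_identity_when_branch_is_zero() -> None:
    # With the last conv zeroed, the residual branch outputs 0,
    # so a block with an identity skip must return its input unchanged.
    block = ResidualBlock(4, 4)
    nn.init.zeros_(block.conv2.weight)
    nn.init.zeros_(block.conv2.bias)
    x = torch.randn(2, 4, 8, 8)
    assert torch.allclose(block(x), x)


def check_gradients_reach_all_parameters() -> None:
    # A parameter with no gradient usually means a layer is not wired in.
    block = ResidualBlock(3, 8)
    block(torch.randn(2, 3, 16, 16)).sum().backward()
    for name, param in block.named_parameters():
        assert param.grad is not None, name
        assert torch.isfinite(param.grad).all(), name


if __name__ == "__main__":
    check_shape()
    check_identity_when_branch_is_zero()
    check_gradients_reach_all_parameters()
    print("all residual block checks passed")
```

The same idea should carry over to an encoder or decoder: feed a tensor of known size and assert the shape you expect at each stage, for instance that the spatial size halves per encoder level if that is how your UNet is specified.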
Published papers typically report the performance of the whole architecture only, not of individual components. This is why I find it hard to check my implementation. Could you please elaborate on test cases for the individual components?
Sorry, my well is empty when it comes to the specifics.