Is there any particular reason that ‘accuracy’ is the metric used?
There is a slight class imbalance in the image database provided, so wouldn’t it be good practice to use different metrics (like F1 score or precision-recall)?
Also, is it a good idea to apply more data augmentation to the classes with fewer examples, in order to balance the classes?
`metrics` is a list, so you can pass several of the built-in options at once, or write your own custom metric.
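As a minimal sketch (assuming TensorFlow/Keras, since that is what `metrics` as a list suggests), you can track precision and recall alongside accuracy by listing them in `model.compile()`; the model shape here is purely illustrative:

```python
import tensorflow as tf

# Hypothetical tiny binary classifier, just to show the metrics list.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(64,)),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

model.compile(
    optimizer="adam",
    loss="binary_crossentropy",
    # `metrics` is a list: combine several built-in metrics, or add your own.
    metrics=[
        tf.keras.metrics.BinaryAccuracy(),
        tf.keras.metrics.Precision(),
        tf.keras.metrics.Recall(),
    ],
)
```

All the listed metrics are then reported per epoch during `fit()` and by `evaluate()`.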
Augmentation helps widen the distribution of the training space, which allows the model to better cope with test images that were not part of the original training dataset.
Sampling strategies and adjusting class weights are also commonly used to address class imbalance.
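One way to sketch the class-weight approach (plain Python, no framework required): weight each class inversely to its frequency, then pass the resulting dict to something like Keras `fit(..., class_weight=...)`. The `compute_class_weights` helper and the 80/20 split below are illustrative assumptions, not part of the course code:

```python
from collections import Counter

def compute_class_weights(labels):
    """Weight each class by n / (k * count): rarer classes get larger weights."""
    counts = Counter(labels)
    n = len(labels)          # total number of examples
    k = len(counts)          # number of classes
    return {cls: n / (k * c) for cls, c in counts.items()}

# Hypothetical 80/20 imbalance between class 0 and class 1
labels = [0] * 80 + [1] * 20
weights = compute_class_weights(labels)
# The minority class ends up with 4x the weight of the majority class
```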
Slight imbalances are not a problem. If you have a sufficient number of examples of each class, you can get good results with as little as 10% of each label.
The other general point to make is that you can view F1 or precision/recall as just elaborations of accuracy. Accuracy is the fundamental thing and those other ways of looking at it can be thought of as the first few steps in “error analysis”. In other words, if you have an accuracy problem, can you say anything more specific about it? E.g. is it a problem with lots of false positives? Or lots of false negatives? On particular classes or in general? The answers to those questions then drive your decision making about how to approach fixing the problem: do I need more training data of a particular kind or is my model underfitting or …
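To make the "elaborations of accuracy" point concrete, here is a small pure-Python sketch (my own illustration, not course code) computing all four numbers from the same confusion-matrix counts for a binary problem. On an imbalanced toy example, accuracy can look fine while recall exposes the false-negative problem:

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 from raw binary labels."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Imbalanced toy data: 2 positives out of 10; the model misses one of them.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
m = binary_metrics(y_true, y_pred)
# accuracy is 0.9, but recall is only 0.5 — half the positives were missed
```

That is exactly the "error analysis" step: the aggregate accuracy looks healthy, but breaking it down into false positives and false negatives tells you where the errors actually are.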
BTW the issues that I mentioned there are covered in detail in Course 3. A lot of people skip that one because it doesn’t include any programming assignments, but it’s worth a look if you want to go deeper on the kind of question you are asking here.