Hi, @Marios_Constantinou!
When it comes to training deep learning models, there are a few things to take into account to speed up the process. First, I'll go over the most common bottlenecks in the overall pipeline:
- Loading data from disk for each batch: if all the data is pre-loaded in RAM, it is much faster, although this is not always possible due to memory constraints. If preloading isn't an option, make sure the loading process itself is optimized (see the first sketch after this list).
- The evaluation process may take some time. It might be a good option to evaluate only every few epochs rather than after every single one.
- Saving the model on each epoch: similar to the previous point; consider checkpointing only every few epochs instead (the second sketch after this list covers both).
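
For the data-loading point, here is a minimal sketch of common PyTorch `DataLoader` speed-ups; the in-memory `TensorDataset` is just a placeholder for your own dataset:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Placeholder in-memory dataset; substitute your own Dataset here.
train_dataset = TensorDataset(
    torch.randn(1024, 3, 32, 32),
    torch.randint(0, 10, (1024,)),
)

train_loader = DataLoader(
    train_dataset,
    batch_size=64,
    shuffle=True,
    num_workers=4,            # load batches in parallel worker processes
    pin_memory=True,          # speeds up host-to-GPU transfers
    persistent_workers=True,  # keep workers alive between epochs
)
```

`num_workers` is worth tuning per machine: too few leaves the GPU waiting for data, too many can oversubscribe the CPU.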
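For the evaluation and checkpointing points, a minimal training-loop sketch; `model`, `num_epochs`, `train_one_epoch`, `evaluate`, and the loaders are hypothetical stand-ins for your own code:

```python
import torch

EVAL_EVERY = 5  # evaluate every 5 epochs instead of every one (tune to taste)
SAVE_EVERY = 5  # checkpoint at the same cadence

for epoch in range(num_epochs):
    train_one_epoch(model, train_loader)       # your training step
    if (epoch + 1) % EVAL_EVERY == 0:
        metrics = evaluate(model, val_loader)  # your validation step
        print(f"epoch {epoch + 1}: {metrics}")
    if (epoch + 1) % SAVE_EVERY == 0:
        torch.save(model.state_dict(), f"checkpoint_epoch{epoch + 1}.pt")
```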
Assuming everything else is optimized, the number of parameters is not the only thing that determines how fast a model runs. You also have to consider how many FLOPs (floating point operations) it needs for a single forward pass and how well those operations can be parallelized (throughput). Check Table 1 of Gao et al. for reference.
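
To make throughput concrete, you can time forward passes directly instead of counting parameters. Here is a rough benchmark sketch, where `resnet50` and the batch size of 64 are just placeholders for illustration:

```python
import time
import torch
import torchvision

device = "cuda" if torch.cuda.is_available() else "cpu"
model = torchvision.models.resnet50().to(device).eval()
x = torch.randn(64, 3, 224, 224, device=device)  # placeholder batch

with torch.no_grad():
    for _ in range(5):            # warm-up iterations
        model(x)
    if device == "cuda":
        torch.cuda.synchronize()  # wait for warm-up kernels to finish
    start = time.perf_counter()
    n_iters = 20
    for _ in range(n_iters):
        model(x)
    if device == "cuda":
        torch.cuda.synchronize()  # wait for all kernels before stopping the clock
    elapsed = time.perf_counter() - start

print(f"throughput: {n_iters * x.shape[0] / elapsed:.1f} samples/s")
```

Two models with the same parameter count can differ a lot on this measurement, since depth, layer types, and memory access patterns all affect how well the hardware is utilized.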