Hello,
I was wondering whether transfer learning could help find a good approximation of some key hyperparameters, or a good initialization of the network, when the number of pre-training samples available is not much greater than the number of fine-tuning samples. Has this been tried already?