Hello,
I am currently studying image-to-image translation with Pix2Pix. I created a synthetic dataset of paired feature and label images with custom code and trained Pix2Pix on it. When I fed the test set's feature images to the trained model, the predicted labels showed patterns similar to the ground truth. However, the predictions were not accurate enough, so I decided to try to improve the model by modifying the loss function, specifically the pixel distance (L1) term.
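For context, my generator loss follows the standard Pix2Pix formulation: an adversarial BCE term plus a weighted L1 term (the paper uses λ = 100). Here is a minimal PyTorch sketch, where names such as `disc_fake_logits`, `fake_labels`, and `real_labels` are placeholders for my own tensors:

```python
import torch
import torch.nn as nn

bce = nn.BCEWithLogitsLoss()  # adversarial term on the discriminator's logits
l1 = nn.L1Loss()              # pixel distance term

def generator_loss(disc_fake_logits, fake_labels, real_labels, lambda_l1=100.0):
    # Adversarial loss: encourage the discriminator to score fakes as real
    adv = bce(disc_fake_logits, torch.ones_like(disc_fake_logits))
    # Pixel distance loss: L1 between generated labels and ground truth
    pix = l1(fake_labels, real_labels)
    return adv + lambda_l1 * pix
```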
My goal is not to predict every region of the label image accurately but to focus on specific regions. I therefore modified the code to compute the pixel distance loss only over those regions and retrained Pix2Pix. Unfortunately, the results did not improve noticeably over the original model.
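Concretely, my change is equivalent to a masked L1 term like the sketch below; the `mask` tensor encodes my regions of interest (1 inside, 0 outside) and is a placeholder for my actual region definition:

```python
def masked_l1_loss(fake_labels, real_labels, mask):
    # mask: binary tensor, 1 inside the regions of interest, 0 elsewhere,
    # broadcastable to the label shape
    diff = torch.abs(fake_labels - real_labels) * mask
    # Normalize by the number of masked-in pixels so the loss magnitude
    # does not depend on the region size
    return diff.sum() / mask.sum().clamp(min=1.0)
```

Given that, I would like to ask the following questions: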
- If I compute the L1 loss only over specific regions, should I expect more accurate predictions in those regions?
- Does the pixel distance (L1) term in the Pix2Pix loss significantly improve training compared to using the adversarial BCE loss alone?
- What other methods can I try to improve the prediction accuracy of the Pix2Pix model in my case?
Thank you for reading this long post.