Training and validation loss not working in PyTorch

Scotley_Winata · June 13, 2023, 2:21am

Hello, I am having trouble with my loss: the training and validation losses are all zero. Does anyone know how to fix this code? I don’t have much of a foundation in machine learning coding yet. I also don’t understand how to use loss functions such as MSE loss, cross-entropy, etc.

paulinpaloalto · June 13, 2023, 2:32am

It looks like you are just feeding the input X and Y values to the loss function. It’s supposed to be the output of the model (\hat{Y}) versus the labels (Y), right?

In fact if that even worked (meaning did not throw a dimension mismatch error), then I would have some questions about how your criterion function is implemented. Normally the X values have multiple features, whereas the Y values (labels) typically don’t.
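A minimal sketch of that distinction, assuming a toy `nn.Linear` model and random tensors (both are placeholders, not the poster's actual model or data): the criterion receives the model output and the labels, never the raw inputs.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical stand-in data: 8 samples, 3 features each, 1 numeric label each.
X = torch.randn(8, 3)
Y = torch.randn(8, 1)

model = nn.Linear(3, 1)      # placeholder model, not the poster's model
criterion = nn.MSELoss()

Y_hat = model(X)             # forward pass: predictions, same shape as Y
loss = criterion(Y_hat, Y)   # compare predictions against labels, not X against Y

print(Y_hat.shape, Y.shape)  # both shapes should match
print(loss.item())           # a non-negative scalar
```

As far as I know, when the two shapes differ but can be broadcast (for example `(8, 3)` against `(8, 1)`), `nn.MSELoss` may only emit a warning rather than raise an error, so a call like `criterion(X, Y)` can run without complaint and still be meaningless.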

paulinpaloalto · June 13, 2023, 2:34am

You also mention that you don’t have much experience with ML coding. Have you taken any of the specializations here? MLS, DLS and GANs would be good background. Note that DLS uses TensorFlow, but GANs actually does use PyTorch. So that would be a good way to get more experience with torch. But I don’t think you can just start with GANs if you haven’t at least taken DLS C1.

Scotley_Winata · June 13, 2023, 2:38am

I took the Coursera ML specialization by Andrew. However, it only helped my understanding of the concepts and not much of the coding.

paulinpaloalto · June 13, 2023, 2:40am

Do you mean the original Stanford Machine Learning that used MATLAB? If so, you might find DLS C1 worth a couple of weeks. It uses python.

Scotley_Winata · June 13, 2023, 3:04am

I did pass in the training and validation loaders, which contain X and Y, as well. However, I don’t understand why it doesn’t work. My biggest weakness right now is that I can’t code the training and validation loss from scratch.

I do not understand how to feed my own data into the training loop. Most examples on the internet use MNIST, which comes in a very different data type.

I got the criterion from the template I took the code from.

Scotley_Winata · June 13, 2023, 3:05am

It’s not the MATLAB one; it used Jupyter notebooks with TensorFlow. However, I did try some of the MATLAB one, which I did with Octave. However, PyTorch feels really different from the very mathematical coding of TensorFlow.

It is harder to know what is going on in PyTorch’s code than in the much more bare-bones TensorFlow.

paulinpaloalto · June 13, 2023, 9:13pm

What is the criterion function that you are using? Is that a provided pytorch function?

Of course the loss function is critical to everything, so it’s important to make sure it’s appropriate for your application. What is the nature of your data? What is the prediction you need from your model? Is it a “classification” or is it a numeric answer (a regression problem)?

ai_curious · June 13, 2023, 11:29pm

https://nn.readthedocs.io/en/rtd/criterion/index.html

@Scotley_Winata can you show us the line where criterion is instantiated? Something like criterion = BCECriterion()

Scotley_Winata · June 14, 2023, 3:10am

I am currently using torch.nn’s nn.MSELoss, but I think the problem is that I passed the label and the input instead of the prediction/output and y1arr/target/input(?). What I do not understand is how to declare the output.

If you are wondering, the data I am using pretty much only requires np.mean, except for the going home and rain.

To be honest, I might as well post my gist here. I know how horrible this looks.

Scotley_Winata · June 14, 2023, 3:13am

sure,

criterion = nn.MSELoss()
optimizer = torch.optim.Adam(modely1.parameters(), lr=learning_rate)
optimizer2 = torch.optim.Adam(modely2.parameters(), lr=learning_rate)

This is right above the training loop. Don’t mind the two optimizers; the program is going to predict two targets.

ai_curious · June 14, 2023, 12:03pm

This tutorial at the PyTorch site might help…

https://pytorch.org/tutorials/beginner/introyt/trainingyt.html

There is a section called The Training Loop with code that should help.
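For data that is not MNIST, the same pattern can be sketched as follows. This is only an illustration under stated assumptions: the NumPy arrays, the `nn.Linear` model, the split sizes, and the hyperparameters are all hypothetical and are not taken from the gist or the tutorial.

```python
import numpy as np
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset, random_split

torch.manual_seed(0)
rng = np.random.default_rng(0)

# Hypothetical tabular data: 64 rows, 4 features, one numeric target per row.
X = rng.normal(size=(64, 4)).astype(np.float32)
y = X.sum(axis=1, keepdims=True)

# Wrap the NumPy arrays as tensors, then split into training and validation sets.
dataset = TensorDataset(torch.from_numpy(X), torch.from_numpy(y))
train_set, val_set = random_split(dataset, [48, 16])
train_loader = DataLoader(train_set, batch_size=16, shuffle=True)
val_loader = DataLoader(val_set, batch_size=16)

model = nn.Linear(4, 1)  # placeholder regression model
criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.05)

history = []
for epoch in range(20):
    # Training pass: forward, loss on (output, target), backward, update.
    model.train()
    train_loss = 0.0
    for inputs, targets in train_loader:
        optimizer.zero_grad()
        outputs = model(inputs)
        loss = criterion(outputs, targets)
        loss.backward()
        optimizer.step()
        train_loss += loss.item() * inputs.size(0)
    train_loss /= len(train_loader.dataset)

    # Validation pass: same loss, but no gradients and no parameter updates.
    model.eval()
    val_loss = 0.0
    with torch.no_grad():
        for inputs, targets in val_loader:
            val_loss += criterion(model(inputs), targets).item() * inputs.size(0)
    val_loss /= len(val_loader.dataset)
    history.append((train_loss, val_loss))

print("first epoch:", history[0])
print("last epoch: ", history[-1])
```

Weighting each batch loss by `inputs.size(0)` and dividing by the dataset size gives a per-sample average even when the last batch is smaller than the others.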

Scotley_Winata · June 14, 2023, 5:32pm

Thanks, I think I might know what I did wrong now

Scotley_Winata · June 15, 2023, 3:55am

I might have found the real problem. Apparently the data just never got into the loop? Or maybe there’s something wrong during the reshape?

paulinpaloalto · June 15, 2023, 3:51pm

You don’t show us the definition of train_loader1. So one theory would be that you never actually execute any iterations of that loop. Put a print statement in that for loop and see what you find.

This is called “debugging” and it’s part of the job.

To make the point in a more concrete and actionable way: you don’t have to wonder what something does. You need to find out what it does, either by instrumenting it or, as a more sophisticated approach, by using a debugger to single-step or set breakpoints.
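One way to instrument such a loop is sketched below. The loader here is a hypothetical stand-in built to yield zero batches (10 samples, `batch_size=32`, `drop_last=True`); it is not the `train_loader1` from the gist, whose definition was never shown, and this may not be the cause in that code.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical loader that silently yields no batches: with drop_last=True,
# the only (partial) batch of 10 samples is discarded.
dataset = TensorDataset(torch.randn(10, 3), torch.randn(10, 1))
train_loader1 = DataLoader(dataset, batch_size=32, drop_last=True)

# Check the sizes before the loop even starts.
print("samples:", len(train_loader1.dataset))
print("batches:", len(train_loader1))

running_loss = 0.0
num_batches = 0
for inputs, targets in train_loader1:
    # Instrumentation: prints once per batch, if the loop body runs at all.
    print("batch", num_batches, inputs.shape, targets.shape)
    num_batches += 1

if num_batches == 0:
    print("loop body never ran; running_loss is still", running_loss)
```

If nothing is printed from inside the loop, an accumulated loss that was initialized to zero stays at zero, which would match the symptom of all-zero losses.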

Scotley_Winata · June 16, 2023, 7:59am

I had posted the gist here. Anyway, thanks; the problem was most likely the train loader.

paulinpaloalto · June 16, 2023, 4:03pm

Sorry, I did not look at your code in detail other than the original sample you posted here. My view is that it’s not our job to do your debugging for you. That was my point: if you are going to play these games on an ongoing basis, then the point is that you need to develop your own skills at this. And the best way is just to “do it”. The more hand to hand combat you do with code, the better you will get at it. The other way to state the point is to say that it’s a mistake to think of debugging as just an annoyance: it’s actually a key part of the job. You’re not really a programmer yet if you can just write code, but it doesn’t work, right? What’s the point if it doesn’t actually work?

I was trying to help at a “meta” level by pointing out methods to approach the problem. I think the famous proverb most commonly attributed to the philosopher Lao Tzu is the best way to think about this situation:

“If I give a man a fish, he will not be hungry today. But if I teach a man to fish, he will never be hungry again.”

ai_curious and I won’t always be around to solve your problems for you, so you need to view this as learning to fish as opposed to hoping one of us will just hand you a tasty fish.

Scotley_Winata · June 16, 2023, 5:29pm

OK, thanks. It’s just that sometimes I don’t know what’s going on in the code. I can read and understand it, but writing it is different.

paulinpaloalto · June 16, 2023, 6:12pm

But the point is that when you don’t understand what a given piece of code is doing, then you need to figure out how to instrument it to understand that. Just one example is if you think a loop is not getting executed, set a breakpoint there or put a print statement in the loop. Wondering or considering what a piece of code might do is not useful. You need to figure out a way to see what it is doing.

Also note that there are probably better environments for developing Python code than Jupyter notebooks. I have not used PyCharm, but I think a lot of people find it useful. There are others as well. You want something that has a good GUI with a debugger “built in”.

Scotley_Winata · June 19, 2023, 5:18pm

thanks for the advice!
