K-fold cross validation

I use k-fold cross validation on my data. My problem is that when I use it with one model, I get a bad result for every fold. But when I instead just give the training set, for example, the first 80% of the data, I get better accuracy. Is this a normal problem, or have I implemented k-fold wrongly?

K-fold cross validation is not a topic that Prof Ng covers in this specialization (at least not that I can remember). I googled it and found this explanation on Jason Brownlee’s website.

Having taken a quick look at Jason’s explanation, I think I have the basic idea of what k-fold cross validation means. But even with that (perhaps sketchy) understanding, I am having trouble making sense of your question. This is purely a guess on my part, but are you sure you understood the point that you start the training completely from scratch on each “fold”?
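In case a concrete picture helps, here is a minimal sketch of what “from scratch on each fold” means, using scikit-learn’s KFold and an MLPClassifier as a stand-in for whatever model you are actually training (the toy data and layer sizes here are just placeholders):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import KFold
from sklearn.neural_network import MLPClassifier

# Toy data just for illustration; substitute your own X and y.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

kf = KFold(n_splits=5, shuffle=True, random_state=0)
fold_scores = []
for train_idx, test_idx in kf.split(X):
    # Build a brand-new model INSIDE the loop, so every fold starts from
    # freshly initialized weights -- nothing is carried over between folds.
    model = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500, random_state=0)
    model.fit(X[train_idx], y[train_idx])      # train on the k-1 training folds
    fold_scores.append(model.score(X[test_idx], y[test_idx]))  # score on the held-out fold

print("per-fold accuracy:", fold_scores)
print("mean accuracy:", np.mean(fold_scores))
```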

But I think the higher level point is that you need to give us more to go on here: please give a more detailed explanation of what you tried and what results you saw. Please include information about the nature of your dataset. Is it one of the ones from the assignments here? Or from somewhere else? How many total samples does it contain?

As the name “k-fold” suggests, each split of the data is one fold. I implemented one model; it has 3 layers. The first and second layers use ReLU and the last one is softmax. When I split my data with train_test_split from the sklearn library with a test size of 0.2, I get 99% train accuracy and 90% test accuracy. But when I use k-fold on the same data, each of the folds gives about 75% train accuracy and 65% test accuracy.
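Roughly, the non-k-fold version of my setup looks like this (a simplified sketch; the layer widths, class count, and toy data stand in for my real values):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from tensorflow import keras

# Placeholder data standing in for the real dataset (the class count is made up).
X, y = make_classification(n_samples=2000, n_features=20, n_informative=15,
                           n_classes=10, random_state=0)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# 3-layer network: ReLU, ReLU, softmax (layer widths are placeholders)
model = keras.Sequential([
    keras.Input(shape=(X.shape[1],)),
    keras.layers.Dense(64, activation="relu"),
    keras.layers.Dense(32, activation="relu"),
    keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(X_train, y_train, epochs=20, verbose=0)

print("train accuracy:", model.evaluate(X_train, y_train, verbose=0)[1])
print("test accuracy:", model.evaluate(X_test, y_test, verbose=0)[1])
```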


That doesn’t sound like what I would expect. But what value of k did you use? That kind of matters, right?

To get the equivalent of 0.2 for train/test split, you would use k = 5, right?
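To make that concrete, here is a tiny sketch (with made-up data sizes) showing that test_size=0.2 and KFold with n_splits=5 both hold out 20% of the samples; k-fold just does it five times, once per fold:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import KFold, train_test_split

X, y = make_classification(n_samples=1000, random_state=0)

# Single 80/20 split
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
print(len(X_tr), len(X_te))                  # 800 200

# Five different 80/20 splits, one per fold
for train_idx, test_idx in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
    print(len(train_idx), len(test_idx))     # 800 200 for every fold
```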


I used k = 5. Both are 20/80 splits, but the results are different.

Ok, is anything else different about the various hyperparameters? E.g. number of iterations, learning rate, network architecture?

Are you using some “black box” scikit routine to do the “k-fold” logic? Or the non-k-fold?

The point is that the fundamental training operation is the same, right? It should only be a question of how you actually perform the data split.

But if you are just doing all this stuff by calling some black box library routine, then you have no control over any of it.
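For comparison, here is roughly what the two routes look like; cross_val_score is the “black box”, while an explicit KFold loop gives you control over every step. This is just a sketch with a scikit-learn classifier standing in for the real network; note also that cross_val_score’s default CV for a classifier is stratified and unshuffled, which by itself can move the numbers:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import KFold, cross_val_score
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

# "Black box": scikit-learn handles splitting, training, and scoring internally.
scores = cross_val_score(MLPClassifier(max_iter=500, random_state=0), X, y, cv=5)
print("cross_val_score accuracies:", scores)

# Explicit loop: you control the split, the training call, and the metric.
kf = KFold(n_splits=5, shuffle=True, random_state=0)
for train_idx, test_idx in kf.split(X):
    model = MLPClassifier(max_iter=500, random_state=0)
    model.fit(X[train_idx], y[train_idx])
    print("manual fold accuracy:", model.score(X[test_idx], y[test_idx]))
```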

Yes, every part of the architecture is the same, but I don’t understand why this happens. My main problem is that the split is 20/80, and the last fold of the data is split in exactly the same way, yet it doesn’t give the same result or even a close one.

Sorry, what do you mean “the architecture is the same”? Did you actually write and use the same code to execute the training in both cases? You didn’t really answer my question about whether you are using some sklearn API (which is a “black box”) to execute any part of this.

If the behavior is different, then there has to be a way to explain why that happens.

When you use k-fold cross validation to tune the hyperparameters, you are training your network on subsets of your training data instead of all of your training data. Thus you end up with lower accuracy in training and testing. Using different values of k also leads to different results.
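A quick arithmetic sketch of that last point: the fraction of the data each fold trains on is (k - 1) / k, so changing k changes the amount of training data and, with it, the scores.

```python
# Training and validation fractions per fold for a few values of k.
for k in (3, 5, 10):
    print(f"k={k}: train on {(k - 1) / k:.0%} of the data, validate on {1 / k:.0%}")
```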
