Hi everyone,
I’ve been stuck on this problem for days…
I know that the problem is the last mini-batch. Its shape should be (12288, 20), but I get (1, 12288, 64). I don’t really get how to use k in the for loop.
Following the hint, I wrote: mini_batch_X = shuffled_X[:, k * mini_batch_size: (k+1) * mini_batch_size]. But that is not right.
Can someone help me, please?
Notice that your shapes are correct for the first 2 (complete) minibatches. It’s only on the last (incomplete) one that the dimensions are wrong. So compare the way you did the “slicing” logic in the case of the last partial minibatch to how you did it for the full minibatches.
In other words, if you’re having trouble finding the bug, that probably just means you’re looking in the wrong place. It should be very clear where the mistake is.
Thank you a lot for the reply.
Yes, the mistake is in the calculation of mini_batch_X (shuffled_X[:, k * mini_batch_size: (k+1) * mini_batch_size]) and mini_batch_Y. This is because I am using ‘mini_batch_size’ for the last batch as well. I can’t think of a way to implement the “exception” for the last batch. Can you help me?
What is different about the “partial” batch case? The start value of the range is the same as in the “full” case, but the end value is different. But it turns out you can just say “start here, but give me everything after that point”, right? How would you say that in python?
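To make the “start here, but give me everything after that point” idea concrete, here is a minimal NumPy sketch. The array X and the sizes (148 examples, mini-batch size 64) are assumptions chosen to match the expected (12288, 20) shape mentioned above; this is not the assignment code.

```python
import numpy as np

m = 148                       # assumed number of examples (columns)
mini_batch_size = 64          # assumed mini-batch size
X = np.zeros((12288, m))      # hypothetical stand-in for shuffled_X

num_complete = m // mini_batch_size        # number of full mini-batches: 2
start = num_complete * mini_batch_size     # first column of the partial batch: 128

# Leaving out the end of the slice means "from start to the last column".
last_X = X[:, start:]
print(last_X.shape)  # (12288, 20)
```

The start value is computed the same way as for a full batch; only the end of the slice changes.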
But from your deleted post, I’m afraid that you’re still missing my point. You are showing the logic for the “full” case. That is already correct. The whole point here is that it’s the “partial” case that is wrong. That’s different logic, right?
Although actually, now that I think about it, it is legal to index off the end of an array dimension in python, as long as you do it with a “range” (that is, a slice). It will just give you all it can. So maybe that should have worked. But the original problem you show is that you really were indexing differently in the “partial” case, so that you ended up with 3 instead of 2 dimensions.
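Here is a small sketch of that “index off the end” property, using a toy array that is purely illustrative. A slice that runs past the end is clipped, while a single integer index past the end raises an error.

```python
import numpy as np

a = np.arange(10).reshape(2, 5)   # toy array: 2 rows, 5 columns

# The slice 3:8 asks for columns 3..7, but only columns 3 and 4 exist.
# NumPy clips the slice instead of raising an error.
clipped = a[:, 3:8]
print(clipped.shape)  # (2, 2)

# A plain integer index past the end is not clipped; it raises IndexError.
try:
    a[:, 8]
    raised = False
except IndexError:
    raised = True
print(raised)  # True
```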
Ah ok, so the for loop is correct but the last ‘if’ is wrong. Right?
Copying the same code I have in the for loop, I get this. So at least I don’t have 3 dimensions anymore. Now I have to implement the ‘stop’ at the last batch:
Well, I would be worried that your “start” values are also wrong somehow. If you’re using the “index off the end” strategy for the last batch, then why does it have 64 entries? You must be starting at the wrong index.
No, I haven’t used the “index off the end” strategy yet. I’m still thinking about how to implement it. I’m overthinking it for sure. I’m quite sure there is an easy way…
shape of the 1st mini_batch_X: (12288, 64)
shape of the 2nd mini_batch_X: (12288, 64)
shape of the 3rd mini_batch_X: (12288, 63)
shape of the 1st mini_batch_Y: (1, 64)
shape of the 2nd mini_batch_Y: (1, 64)
shape of the 3rd mini_batch_Y: (1, 63)
This is my code. I think there is probably a communication problem, so this is the fastest way to solve it.
I know I shouldn’t paste it here; I will delete it as soon as I get a reply.
I would also love to thank you for the time you are dedicating to helping me. I appreciate it. @paulinpaloalto
But the “for 10 bonus points” solution here is that it turns out you don’t even need to handle the “partial batch” case separately. Because of the property that I demonstrated earlier (that you can index off the end of an array as long as you use a “range” to do it), it would have just worked if we had used the one main loop and made sure to loop over the last partial batch as well.
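As a rough illustration of that single-loop idea, here is a minimal sketch. The function name mini_batches_sketch, the use of np.random.default_rng for shuffling, and the 148-example test arrays are all assumptions made for this example; the assignment’s own function and seeding may differ.

```python
import math
import numpy as np

def mini_batches_sketch(X, Y, mini_batch_size=64, seed=0):
    """Hypothetical helper: split the columns of X and Y into shuffled mini-batches."""
    m = X.shape[1]                               # number of examples
    rng = np.random.default_rng(seed)            # assumed shuffling method
    permutation = rng.permutation(m)
    shuffled_X = X[:, permutation]
    shuffled_Y = Y[:, permutation]

    # Rounding up makes the loop visit the last, possibly partial, batch too.
    num_batches = math.ceil(m / mini_batch_size)
    batches = []
    for k in range(num_batches):
        start = k * mini_batch_size
        end = (k + 1) * mini_batch_size          # may exceed m; the slice clips it
        batches.append((shuffled_X[:, start:end], shuffled_Y[:, start:end]))
    return batches

X = np.zeros((12288, 148))   # assumed sizes, matching the shapes in this thread
Y = np.zeros((1, 148))
shapes = [(bx.shape, by.shape) for bx, by in mini_batches_sketch(X, Y)]
print(shapes)
# [((12288, 64), (1, 64)), ((12288, 64), (1, 64)), ((12288, 20), (1, 20))]
```

With this structure there is no separate ‘if’ for the partial batch, so there is no second piece of slicing logic that can drift out of sync with the first.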