Tough zombies loss curve

This one is particularly challenging. After a coding homework marathon I was able to get batch training working, but I think I'm still a long way from the expected output. I believe SGD is the native optimizer for this model. I was able to get Adam to run a batch, but I run into issues when I try the other optimizers I'm familiar with that seem like they could fit this architecture. I'm on my 8th attempt, and the best loss I've gotten down to so far is 1.25. I'm sliding momentum, learning rate, batch size, and number of batches all over the place, but I can't seem to get below 1.0 yet. In the expected output, I saw that the loss jumped to 16.06 on his first batch output before falling to 0.0004159231 by batch 90. Can we tweak anything outside of the four parameters we were encouraged to adjust? And could I get an explanation of why the expected output exhibited that precipitous rise and fall? Laurence, in an earlier course, gave us a transfer learning example that produced that kind of one-time spike in the loss curve, but it doesn't seem like we have the full arsenal at our disposal on this one.
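For anyone following along, here is a toy sketch of how the four knobs mentioned above (learning rate, momentum, batch size, number of batches) enter a mini-batch SGD-with-momentum loop. It is not the assignment's detection model or training step: it fits a single weight to made-up data for y = 3x, and the function name `sgd_momentum_fit` and all default values are assumptions chosen purely for illustration.

```python
import random


def sgd_momentum_fit(learning_rate=0.05, momentum=0.9,
                     batch_size=4, num_batches=200, seed=0):
    """Fit y = 3x with one weight using mini-batch SGD with momentum.

    Toy illustration only (hypothetical helper, synthetic data).
    Returns the final weight and the list of per-batch losses.
    """
    rng = random.Random(seed)
    # Synthetic, noise-free dataset: the ideal weight is exactly 3.0.
    data = [(x, 3.0 * x) for x in (rng.uniform(-1.0, 1.0) for _ in range(64))]

    w = 0.0         # the single trainable weight
    velocity = 0.0  # momentum buffer
    losses = []

    for _ in range(num_batches):
        batch = rng.sample(data, batch_size)
        # Mean squared error over the batch, and its gradient w.r.t. w.
        loss = sum((w * x - y) ** 2 for x, y in batch) / batch_size
        grad = sum(2.0 * (w * x - y) * x for x, y in batch) / batch_size
        # Classic momentum update: accumulate velocity, then step.
        velocity = momentum * velocity - learning_rate * grad
        w += velocity
        losses.append(loss)

    return w, losses


if __name__ == "__main__":
    final_w, history = sgd_momentum_fit()
    print("final weight:", final_w)
    print("first loss:", history[0], "last loss:", history[-1])
```

Re-running it with different values for the four arguments is a cheap way to get a feel for how they interact before spending time on the real model. Nothing here is meant to reproduce the numbers from the expected output.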


Hello Paul,

From your question, it seems you are not getting the desired accuracy. Am I right?


At first I was able to use an Adam optimizer, but I probably shouldn't have dug that far into the model. Now I've blown it up and I can't get a new batch going. Initially I was trying the 4 fine-tuning parameters; now I can't resolve a shape error at the loss calculation after providing the ground truth. I tried to redo it from scratch, but I keep hitting a wall there or at the train step. I had the loss in the low 1.2 range, but then my boxes were wrong, so I think I fixed that by going to the provided ones instead of selecting my own. I haven't been able to run a batch since, though. Any chance you could scan my notebook? I'm going a little insane. On the other hand, I've really been in a very deep study of this method of sharing weights from the retina model and using them in this detection model. It's got to be something small, but I can't find it yet, and I would love to move on from this one.

Well, it's OK, no need. I took a fresh notebook and a few hours later, “voila”, done and done. I can't believe this is the third time I've asked for help and the solution unveiled itself on my next attempt. This one was a doozy, to be extra campy. My learning journey takes many forms; this time it was a deep dive on one spot for days. What actually helped was reading your advice to another student about effort: while you may put the time in, you have to have laser focus as well, especially in a debugging episode, and that took another layer of effort. I kept my code as clean as I could, meticulously read every drop of the criteria, left no stone unturned, and magically hit the figure on the next batch. So thanks, indirectly, and no need to review my notebook this go-around.