Greetings Community,

Happy New Year!

I am having trouble generating the exact same output; the discrepancy seems to start at the third letter onward. I don’t see what I’m missing or doing wrong. Hopefully someone can guide me down the right path.

Thanks in advance,

Omar Brito


Just in case, my Lab ID is smjumkgaiuhv.

Thanks again!

Hey @Omar_Said_Brito_Sala,

A very happy new year to you as well! Now, coming to your query, can you please DM me your implementation of the `sample` function, so that I can try to help you find the exact issue?

By the way, only staff members can see your lab responses, so unless someone explicitly requests your lab ID, there’s no use in sharing it. I just wanted to let you know for future reference.

Cheers,

Elemento


Greetings Elemento,

Thanks for your help and for your suggestion.

This is my implementation of the sample function:

*{moderator edit - solution code removed}*

The mistake is that you have not handled the initialization of x correctly on each iteration of the loop. The way you implemented it, you’ll end up with a vector that has multiple elements set to 1. It’s supposed to be a “one hot” vector, right?
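To make that concrete, here is a minimal sketch (not the course’s actual `sample` code; a random index stands in for the character the model would sample): the fix is to rebuild x as a fresh zero vector on every pass through the loop before setting the sampled index.

```python
import numpy as np

vocab_size = 27  # 26 letters plus the newline character
rng = np.random.default_rng(0)

x = np.zeros((vocab_size, 1))  # input to the first step: all zeros

for t in range(5):
    # (forward propagation and softmax sampling would happen here;
    #  a random index stands in for the sampled character)
    idx = int(rng.integers(vocab_size))

    # The fix: create a fresh zero vector each iteration, then set one entry
    x = np.zeros((vocab_size, 1))
    x[idx] = 1

    assert x.sum() == 1  # exactly one element is set -> a true one-hot vector
```

If you instead keep writing 1’s into the same x, every earlier index stays set and the vector stops being one-hot.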

BTW, Elemento requested that the code be shared through a DM (Direct Message) rather than posted on the public thread. We aren’t supposed to share solution code publicly. No harm done, since I have deleted it now, but for future reference the way to send a DM on Discourse is to click the person’s name and then click “Message”.


Hi PaulinPaloAlto,

Ok, I initialized x with shape (27, 100) and then started setting 1’s in each column at the appropriate index. It got closer, but it is still throwing the error. I think my softmax is producing different probabilities than the ones being evaluated.

Another thing I am seeing: I always hit counter == 50, so the ‘\n’ index gets appended. This means my probabilities never point to index 0 before counter reaches 50.

Anything else I need to consider?

Thanks again,
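For reference, a common NumPy pattern for sampling an index from softmax probabilities looks like the sketch below (a generic illustration, not the assignment’s exact code; `z` is a stand-in for the pre-softmax activations). A numerically unstable or mis-shaped softmax is a frequent reason the sampled indices diverge from the expected output.

```python
import numpy as np

def softmax(z):
    # Subtract the max before exponentiating for numerical stability
    e = np.exp(z - np.max(z))
    return e / e.sum()

vocab_size = 27
np.random.seed(0)

z = np.random.randn(vocab_size, 1)  # stand-in for the model's output activations
probs = softmax(z)

# np.random.choice needs a 1-D probability vector that sums to 1,
# hence the ravel(); otherwise it raises a ValueError
idx = np.random.choice(range(vocab_size), p=probs.ravel())
assert 0 <= idx < vocab_size
```

Note that the sampled sequence is only reproducible if the random seed is set exactly as the grader expects before each call.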

I sent you the code via DM.

Thanks again for all your help.

The value of x is a vector of dimension vocab_size x 1, right? It is supposed to be a “one hot” vector representing one character in the vocabulary.

Yes, I set ‘x’ to np.zeros((vocab_size, 1)), but the output is still different starting from the 3rd letter.

Did you set it that way in the loop also? Or just in the initialization code?

Oh ok, now I got it working. I had to re-declare the zero vector on every iteration of the loop.

Thanks!

Right, that was the point of my earlier response. You were just copying x, so you end up with a vector that has more and more ones in it, instead of a “one hot” vector. That then gets fed back into the forward propagation and things go downhill from there.

In fact, if you think about it, that bug would cause a divergence on exactly the 3rd iteration. On the first iteration, the input is all zeros. On the second iteration, you have only one “true” value set. Then on the third iteration you have two “true” values set in the input x, so you get a different answer in that iteration and from there forward.
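You can see the bug’s effect with a tiny experiment (hypothetical indices, not the graded code): reusing the same x and only ever setting new entries to 1 produces a “multi-hot” vector whose count of ones grows every iteration.

```python
import numpy as np

vocab_size = 27
x = np.zeros((vocab_size, 1))

# Pretend these indices were sampled on three successive steps
for t, idx in enumerate([2, 5, 9], start=1):
    x[idx] = 1          # bug: x is never zeroed out between iterations
    print(t, int(x.sum()))
# step 1 -> 1 one (still one-hot), step 2 -> 2 ones, step 3 -> 3 ones:
# the input fed back into forward propagation diverges from the 3rd step on
```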