I saw that in the assignment we have the following piece of code twice:
any_word = list(word_to_vec_map.keys())[0]
There is also a warning: The training process will take about 5 minutes
Since word_to_vec_map is huge, this is a highly inefficient way to get the first element. I guess it can be replaced with:
any_word = next(iter(word_to_vec_map.keys()))
After making that change, the training time dropped significantly, to around 5 seconds rather than 5 minutes, and the results are the same.
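To see why the difference is so large, here is a small self-contained micro-benchmark. The dictionary below is a hypothetical stand-in for word_to_vec_map (the real one from the assignment maps roughly 400k GloVe words to vectors); the sizes and timings are illustrative, not measurements from the notebook:

```python
import time

# Hypothetical stand-in for word_to_vec_map: ~400k words -> 50-d vectors,
# comparable in size to the GloVe map used in the assignment.
word_to_vec_map = {f"word{i}": [0.0] * 50 for i in range(400_000)}

# Original approach: materializes a 400k-element list just to read one item.
tic = time.process_time()
for _ in range(100):
    any_word = list(word_to_vec_map.keys())[0]
slow = time.process_time() - tic

# Streamlined approach: pulls one key from the iterator, O(1).
tic = time.process_time()
for _ in range(100):
    any_word = next(iter(word_to_vec_map))
fast = time.process_time() - tic

print(f"list-based: {slow:.4f}s, iterator-based: {fast:.4f}s")
```

Note that iterating a dict yields its keys directly, so `next(iter(word_to_vec_map))` is equivalent to `next(iter(word_to_vec_map.keys()))`, and in Python 3.7+ both return the same (first-inserted) key as `list(...)[0]`.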
That’s a great point. The only purpose of that access in the first place is to get the shape of one of the vector embeddings so that the average vector can be initialized with zeros. If we also had access to the index_to_word map, we could make it even more efficient, but that would require another argument to the sentence_to_avg function. If we were going to do that, we could just make the shape an argument.
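For context, here is a minimal sketch of the pattern being discussed, with the efficient lookup in place. The function name follows the assignment, but this body is illustrative rather than the notebook's exact code:

```python
import numpy as np

def sentence_to_avg(sentence, word_to_vec_map):
    """Average the word vectors of a sentence (illustrative sketch)."""
    words = sentence.lower().split()
    # O(1) lookup of any one key, only to learn the embedding dimension:
    any_word = next(iter(word_to_vec_map))
    avg = np.zeros(word_to_vec_map[any_word].shape)
    count = 0
    for w in words:
        if w in word_to_vec_map:
            avg += word_to_vec_map[w]
            count += 1
    if count > 0:
        avg = avg / count
    return avg
```

The zero-initialization is the only reason the function needs to touch a key before seeing the sentence's own words, which is why passing the shape in as an argument would remove the lookup entirely.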
Note that the same inefficient code appears in the model function, but there it only gets executed once. Your change really matters because sentence_to_avg is invoked inside the training loop.
Wow! I just implemented your version and used the following logic to measure the performance of the full 400 iterations of training:
```python
import time

np.random.seed(1)
tic = time.process_time()
pred, W, b = model(X_train, Y_train, word_to_vec_map)
toc = time.process_time()
print(f"training time {toc - tic} seconds")
print(f"type(pred) {type(pred)}")
print(f"pred.shape {pred.shape}")
```
With the original template code as written, the total time is about 317 seconds, which is just a bit over 5 minutes.
With your streamlined version, the time is about 2.8 seconds. Pretty amazing.
Just because we have fast computers these days doesn’t mean you don’t have to think about efficiency when you write an algorithm. As your example starkly demonstrates, it still matters if you code things in an efficient way vs an inefficient way.
@paulinpaloalto, this is related to the issue I reported yesterday about the sentence_to_avg() function: making a list of the keys is exceedingly slow.
Oh, sorry, now that you mention it I did see that go by, but didn’t put two and two together when I saw this thread. Thanks for taking care of filing the actual change request about this.
I have not submitted a support ticket for the item I discovered yesterday, because that was a tip to add to the FAQ about how students should NOT write the code in sentence_to_avg().
But I will now submit a support ticket about the issue reported in this thread, because it’s in code that is provided with the notebook.
I have not messed with the one in pretrained_embedding_layer, but the one that I tried rewriting with Pavel’s new version is in sentence_to_avg. That one makes a huge difference.
So that makes at least 3 occurrences of that inefficient code.
I just tried the experiment of running the model.fit() cell that trains Emojify_V2 with the existing code and then with Pavel’s change in pretrained_embedding_layer, and the 50 epochs take between 16 and 17 seconds in both cases. So I think the inefficient code in pretrained_embedding_layer is only executed once and is not an issue. But if we’re going to suggest fixing it, we might as well do it in all three places.
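That matches the structure of the code: the embedding matrix is built a single time, before training starts. Here is a hedged sketch of that one-time setup (not the notebook's exact pretrained_embedding_layer, just the pattern), showing why the lookup cost cannot accumulate there:

```python
import numpy as np

def build_embedding_matrix(word_to_vec_map, word_to_index):
    """Build the (vocab_size + 1, emb_dim) matrix that would initialize an
    Embedding layer. The first-key lookup runs exactly once here, which is
    why the inefficiency barely shows up in model.fit timings."""
    vocab_size = len(word_to_index) + 1
    any_word = next(iter(word_to_vec_map))        # O(1) instead of list(...)[0]
    emb_dim = word_to_vec_map[any_word].shape[0]
    emb_matrix = np.zeros((vocab_size, emb_dim))  # row 0 stays zero for padding
    for word, idx in word_to_index.items():
        emb_matrix[idx, :] = word_to_vec_map[word]
    return emb_matrix
```

Since this runs once per model build rather than once per sentence per iteration, fixing it is a matter of consistency rather than speed.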
It looks great. You captured everything I was aware of.
Do you have the actual example of the bad code the student wrote in the other thread about sentence_to_avg that caused their notebook to appear frozen? My only suggestion would be to add that, just to give the developers a clear picture of what we need to avoid.