I saw that in the assignment we have the following piece of code twice:
any_word = list(word_to_vec_map.keys())[0]
There is also a warning: The training process will take about 5 minutes
Since word_to_vec_map is huge, this is a highly inefficient way to get the first element. I guess it can be replaced with:
any_word = next(iter(word_to_vec_map.keys()))
After making that change, the training time dropped significantly, to around 5 seconds rather than 5 minutes, and the results are the same.
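To see why the difference is so large, here is a small self-contained micro-benchmark. The dictionary below is a hypothetical stand-in for word_to_vec_map (the real one from the assignment maps roughly 400k GloVe words to vectors); the sizes and timings are illustrative, not measurements from the notebook:

```python
import time

# Hypothetical stand-in for word_to_vec_map: ~400k words -> 50-d vectors,
# comparable in size to the GloVe map used in the assignment.
word_to_vec_map = {f"word{i}": [0.0] * 50 for i in range(400_000)}

# Original approach: materializes a 400k-element list just to read one item.
tic = time.process_time()
for _ in range(100):
    any_word = list(word_to_vec_map.keys())[0]
slow = time.process_time() - tic

# Streamlined approach: pulls one key from the iterator, O(1).
tic = time.process_time()
for _ in range(100):
    any_word = next(iter(word_to_vec_map))
fast = time.process_time() - tic

print(f"list-based: {slow:.4f}s, iterator-based: {fast:.4f}s")
```

Note that iterating a dict yields its keys directly, so `next(iter(word_to_vec_map))` is equivalent to `next(iter(word_to_vec_map.keys()))`, and in Python 3.7+ both return the same (first-inserted) key as `list(...)[0]`.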
That’s a great point. The only purpose of that access in the first place is to get the shape of one of the vector embeddings so that the average vector can be initialized with zeros. If we also had access to the index_to_word map, we could make it even more efficient, but that would require another argument to the sentence_to_avg function. If we were going to do that, we could just make the shape an argument.
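For context, here is a minimal sketch of the pattern being discussed, with the efficient lookup in place. The function name follows the assignment, but this body is illustrative rather than the notebook's exact code:

```python
import numpy as np

def sentence_to_avg(sentence, word_to_vec_map):
    """Average the word vectors of a sentence (illustrative sketch)."""
    words = sentence.lower().split()
    # O(1) lookup of any one key, only to learn the embedding dimension:
    any_word = next(iter(word_to_vec_map))
    avg = np.zeros(word_to_vec_map[any_word].shape)
    count = 0
    for w in words:
        if w in word_to_vec_map:
            avg += word_to_vec_map[w]
            count += 1
    if count > 0:
        avg = avg / count
    return avg
```

The zero-initialization is the only reason the function needs to touch a key before seeing the sentence's own words, which is why passing the shape in as an argument would remove the lookup entirely.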
Note that the same inefficient code appears in the model function, but there it only gets executed once. Your change really matters because sentence_to_avg is invoked inside the training loop.
Wow! I just implemented your version and used the following logic to measure the performance of the full 400 iterations of training:
```python
import time

np.random.seed(1)
tic = time.process_time()
pred, W, b = model(X_train, Y_train, word_to_vec_map)
toc = time.process_time()
print(f"training time {toc - tic} seconds")
print(f"type(pred) {type(pred)}")
print(f"pred.shape {pred.shape}")
```
With the original template code as written, the total time is about 317 seconds, which is just a bit over 5 minutes.
With your streamlined version, the time is about 2.8 seconds. Pretty amazing.
Just because we have fast computers these days doesn’t mean you don’t have to think about efficiency when you write an algorithm. As your example starkly demonstrates, it still matters if you code things in an efficient way vs an inefficient way.
@paulinpaloalto, this is related to the issue I reported yesterday about the sentence_to_avg() function: making a list of the keys is exceedingly slow.
Oh, sorry, now that you mention it I did see that go by, but didn’t put two and two together when I saw this thread. Thanks for taking care of filing the actual change request about this.
I have not submitted a support ticket for the item I discovered yesterday, because that was a tip to add to the FAQ about how students should NOT write the code in sentence_to_avg().
But I will now submit a support ticket about the issue reported in this thread, because it’s in code that is provided with the notebook.
I have not messed with the one in pretrained_embedding_layer, but the one that I tried rewriting with Pavel’s new version is in sentence_to_avg. That one makes a huge difference.
So that makes at least 3 occurrences of that inefficient code.
I just tried the experiment of running the model.fit() cell that trains Emojify_V2 with the existing code and then with Pavel’s change in pretrained_embedding_layer, and the 50 epochs take between 16 and 17 seconds in both cases. So I think the inefficient code in pretrained_embedding_layer is only executed once and is not an issue. But if we’re going to suggest fixing it, we might as well do it in all three places.
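That matches the structure of the code: the embedding matrix is built a single time, before training starts. Here is a hedged sketch of that one-time setup (not the notebook's exact pretrained_embedding_layer, just the pattern), showing why the lookup cost cannot accumulate there:

```python
import numpy as np

def build_embedding_matrix(word_to_vec_map, word_to_index):
    """Build the (vocab_size + 1, emb_dim) matrix that would initialize an
    Embedding layer. The first-key lookup runs exactly once here, which is
    why the inefficiency barely shows up in model.fit timings."""
    vocab_size = len(word_to_index) + 1
    any_word = next(iter(word_to_vec_map))        # O(1) instead of list(...)[0]
    emb_dim = word_to_vec_map[any_word].shape[0]
    emb_matrix = np.zeros((vocab_size, emb_dim))  # row 0 stays zero for padding
    for word, idx in word_to_index.items():
        emb_matrix[idx, :] = word_to_vec_map[word]
    return emb_matrix
```

Since this runs once per model build rather than once per sentence per iteration, fixing it is a matter of consistency rather than speed.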
It looks great. You captured everything I was aware of.
Do you have the actual example of the bad code the student wrote in the other thread about sentence_to_avg that caused their notebook to appear frozen? My only suggestion would be to add that, just to give the developers a clear picture of what we need to avoid.