In Python Basics with Numpy, vectorization does not seem to bring much in your example in terms of computing time, because the matrices are too small (as you point out).
What you could do is generate huge matrices with 10^5-10^6 elements or so to show the real power of vectorization.
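A minimal sketch of that suggestion, assuming a dot product on arrays of 10^6 elements (the array sizes and variable names are my own, not from the assignment):

```python
import time
import numpy as np

# Compare a Python for loop with the vectorized numpy equivalent
# on a large array (~10^6 elements), where the speedup is visible.
n = 10**6
a = np.random.rand(n)
b = np.random.rand(n)

# Dot product with an explicit for loop
start = time.perf_counter()
dot_loop = 0.0
for i in range(n):
    dot_loop += a[i] * b[i]
loop_seconds = time.perf_counter() - start

# Vectorized dot product
start = time.perf_counter()
dot_vec = np.dot(a, b)
vec_seconds = time.perf_counter() - start

# Both give the same answer (up to floating-point rounding),
# but the vectorized version is typically orders of magnitude faster.
print(f"loop: {loop_seconds:.4f}s  vectorized: {vec_seconds:.6f}s")
```

At this size the difference should be obvious on any machine.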
On small data sets, the speed difference is indeed not meaningful. That makes small datasets great for practicing vectorization. So if I have a calculation where I use a for loop, maybe I can rewrite it using one of the numpy operations.
When I’m playing around with code, I might first write the answer using a for loop, get it to work, and then write a vectorized version of the calculation. Having two versions lets me compare the two answers, which will be the same if my code is correct. I can also time the two versions and try a larger data set for comparison.
I then erase the for-loop version and keep only the vectorized version in the final code.
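That two-version workflow might look something like this (the row-norm calculation and function names are just an illustration I made up):

```python
import numpy as np

def row_norms_loop(X):
    """Euclidean norm of each row, computed with explicit loops."""
    norms = []
    for row in X:
        total = 0.0
        for value in row:
            total += value * value
        norms.append(total ** 0.5)
    return np.array(norms)

def row_norms_vectorized(X):
    """Same computation using numpy operations on the whole array."""
    return np.sqrt(np.sum(X ** 2, axis=1))

X = np.random.rand(100, 50)
# If both versions are correct, the results match up to rounding error.
print(np.allclose(row_norms_loop(X), row_norms_vectorized(X)))  # True
```

Once the two agree, the loop version has done its job and can be deleted.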
PS. One can add %%time to the top of a notebook cell to see how long the cell’s calculation took. Please remember to remove it before submitting the assignment.
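Since %%time only works in notebook cells, a portable alternative for a plain Python script is the standard-library timeit module (the array and call count here are arbitrary choices of mine):

```python
import timeit
import numpy as np

# Time 100 calls to a vectorized numpy operation, similar in spirit
# to putting %%time at the top of a notebook cell.
a = np.random.rand(10**5)
seconds = timeit.timeit(lambda: np.sum(a), number=100)
print(f"100 calls to np.sum took {seconds:.4f}s")
```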
For me, the importance of vectorization goes beyond the speedups from loop elimination.
I view it as a change in thinking toward “big operations on big (multi-axis) values”. Those big values will have 3, 4, or more axes in the later courses.
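As a small sketch of what such a multi-axis value looks like (the image-batch shape is a hypothetical example in the spirit of the later courses, not taken from them):

```python
import numpy as np

# A batch of RGB images with 4 axes: (batch, height, width, channels).
images = np.random.rand(32, 64, 64, 3)

# One "big operation on a big value": the per-channel mean over the
# whole batch, collapsing the first three axes in a single call.
channel_means = np.mean(images, axis=(0, 1, 2))
print(channel_means.shape)  # (3,)
```

No loop over images, rows, or pixels is needed; the axes argument does all of it at once.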
That’s a good point, @GordonRobinson. Another positive side effect is that the vectorized version is typically less code and simpler code. And it runs faster. What’s not to like about that?
The important idea behind numpy and vectorization is the ability to use the SIMD instructions available on modern processors (with differences depending on the architecture).
A couple of links:
Yes, @crisrise makes a really important point here. When you call a vectorized numpy routine, it in turn calls a lower-level assembly language routine that uses special CPU instructions specifically designed to make vector computations efficient. It’s not that there’s another “for” loop buried in the library that is somehow a “better” for loop than you could write in python. It’s literally different CPU instructions designed to make this type of operation efficient.
Yes, I know about SIMD, on CPUs and GPUs. My only point was to suggest not showing the improvement with examples where there is no improvement. Just go for large vector operations if you want to illustrate it. As it is now, the tutorial fails to show the gain from vectorizing. Just my 2 cts.
Thanks @Colin, I understand the point you are making and I totally agree. I think experiencing the speed advantage in a practical case is a very good learning experience.