Tensorflow performance

And77 · July 22, 2024, 11:10pm

I implemented the algorithm in numpy, as in the practice for optimization methods, and did the same thing with tensorflow, but the tensorflow version was much slower.
I also made the model more complex and used another dataset. But even with multiprocessor usage activated in tensorflow, the direct numpy implementation is ~10x faster.
I would like to know what is slowing tensorflow down that much here, and whether there is a way to avoid it.

Regards,
Andreas

TMosh · July 23, 2024, 12:25am

How fast TensorFlow runs depends on your computing platform.
It also depends on how efficient your model is.

TMosh · July 23, 2024, 12:26am

Keep in mind also that TensorFlow’s big selling point is not its efficiency; it’s the ease with which you can create complex models that work reliably.

rmwkwok · July 23, 2024, 1:40am

Hello, @And77,

I would first make sure the whole training process (including the model and the loss computation) is built as a tensorflow graph.

Introduction to tensorflow graph

Better performance with tf.function

Cheers,
Raymond

And77 · July 23, 2024, 10:30am

Thank you for the good hints!

However, in the tensorflow version I just used five of the default dense layers with tanh activation, together with the built-in adam optimizer and mse loss function. So I have no idea where I would use the tensorflow graph function.
Maybe it is because I coded the backprop manually in numpy, while tensorflow needs some extra logic to work out the right backprop calculations?
Or maybe it is just calculation overhead (I have seen that an empty for loop iteration in python takes ~500 CPU cycles, while in C it is just a few)?
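
The per-iteration cost of an empty loop can be checked directly. This is only a rough sketch using the standard library's timeit module; `ns_per_empty_iteration` is a hypothetical helper name, and the result depends heavily on the machine and the Python version.

```python
import timeit


def ns_per_empty_iteration(n_iter=1_000_000, repeat=5):
    """Best-of-`repeat` time per iteration of an empty for loop, in nanoseconds."""
    timings = timeit.repeat(
        "for _ in range(n): pass",  # the empty loop being measured
        globals={"n": n_iter},
        number=1,                   # run the whole loop once per timing
        repeat=repeat,
    )
    return min(timings) / n_iter * 1e9


print(f"{ns_per_empty_iteration():.1f} ns per iteration")
```

Multiplying the result by the CPU clock frequency in GHz gives an approximate cycle count per iteration.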

Regards,
Andreas

rmwkwok · July 23, 2024, 11:13am

Oh! I thought you were re-implementing all the logic from scratch, but based on your reply, if you built the network with tf.keras.Sequential(...) and used tf.keras.losses.XXX as your loss function, then there is no need to use the tensorflow graph function yourself.

Overhead is a possible cause, but it is only significant if the training set size is small. Is it small? Btw, we can find the overhead by varying the training set size.
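
One way to do this, as a sketch: time the training at several training set sizes and fit a straight line, time ≈ overhead + per-sample cost × size. The intercept then estimates the fixed overhead. `estimate_overhead` is a hypothetical helper, and the timings below are made-up numbers for illustration, not measurements.

```python
def estimate_overhead(sizes, times):
    """Least-squares fit of times ~ overhead + per_sample * size.

    Returns (overhead, per_sample). Hypothetical helper for illustration.
    """
    n = len(sizes)
    if n < 2 or n != len(times):
        raise ValueError("need at least two (size, time) pairs")
    mean_x = sum(sizes) / n
    mean_y = sum(times) / n
    sxx = sum((x - mean_x) ** 2 for x in sizes)
    if sxx == 0:
        raise ValueError("sizes must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(sizes, times))
    per_sample = sxy / sxx                    # slope: cost per training sample
    overhead = mean_y - per_sample * mean_x   # intercept: fixed cost
    return overhead, per_sample


# Made-up timings in seconds, for illustration only
sizes = [5000, 10000, 20000, 40000]
times = [0.9, 1.3, 2.1, 3.7]
overhead, per_sample = estimate_overhead(sizes, times)
print(f"fixed overhead ~ {overhead:.2f} s, per-sample cost ~ {per_sample:.2e} s")
```

If the intercept is large compared with the per-sample term at the real training set size, overhead is the likely bottleneck.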

I think it would be useful to compare these numbers:

—	numpy	tensorflow
batch size	?	?
number of batches per epoch	?	?

One such case: if we do full-batch training in numpy but mini-batch training in tensorflow with, say, 5 batches, then I would not be surprised if numpy is faster.

And77 · July 23, 2024, 1:26pm

Hmm, I did both. In numpy I did it from scratch, but in tensorflow I just used the default sequential model and loss function.
What I was wondering about is that numpy is 10x faster.

I used the same training data and batch size for both, and I also get very similar losses with both variants.

Here are my values:
The total training dataset has ~50000 samples, with a width of X=5 and Y=1; both are of type float.
The batch size is 64.
For testing, I actually use 5 hidden layers with 18, 13, 9, 5 and 3 neurons, but the effect is the same with slightly different topologies. I assume tensorflow performs better for big CNNs, but I want to understand the bottleneck.

Thank you,
Andreas

rmwkwok · July 23, 2024, 2:24pm

50000 samples and 5 features per sample. That doesn’t look like much to me. The network is not big, either.
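
For a sense of scale, the arithmetic can be sketched as below. It assumes plain dense layers with biases and a single-unit output layer (for Y=1), which the thread does not state explicitly.

```python
import math

n_samples = 50_000
batch_size = 64
# 5 inputs, hidden layers of 18, 13, 9, 5 and 3 units, and an assumed 1-unit output
layer_sizes = [5, 18, 13, 9, 5, 3, 1]

# Update steps per epoch, counting the final partial batch
steps_per_epoch = math.ceil(n_samples / batch_size)

# Each dense layer has n_in * n_out weights plus n_out biases
n_params = sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

print(steps_per_epoch)  # 782
print(n_params)         # 553
```

With only a few hundred parameters, each of the 782 steps per epoch does very little arithmetic, so any fixed per-step cost in the framework can easily dominate the runtime.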

I seldom train with numpy so I am not sure, but you might want to measure tensorflow’s overhead: it does extra work between two epochs. Also, the first epoch is usually slower than the rest.
