Lecture Video: Implementation Note, Parallelization

Hey Guys,
In this video, the instructor introduces the tf.scan() method and shows how it works. Towards the end of the lecture, the following is stated:

It is important to know that these types of abstractions are needed for deep learning frameworks because they allow them to use GPUs and compute in parallel.

But the tf.scan() method inherently uses a for loop, which is a sequential computation in itself, so what kind of parallelization exactly is this lecture referring to?


Hi Elemento,

As I understand it, the reference to a for loop is just made to make it easier for learners to understand how the function works.
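
To make the for-loop picture concrete, here is a minimal pure-Python sketch of what a scan computes. The helper name scan_reference is made up for illustration and is not part of TensorFlow; it only mirrors the idea of carrying an accumulator from one step to the next and keeping every intermediate value.

```python
def scan_reference(fn, elems, initializer):
    """Hypothetical reference helper (not a TensorFlow function).

    Applies fn step by step and returns every intermediate accumulator.
    """
    acc = initializer
    outputs = []
    for x in elems:
        # Each step needs the accumulator produced by the previous step.
        acc = fn(acc, x)
        outputs.append(acc)
    return outputs


# Running sum: every output depends on the one before it.
running_sums = scan_reference(lambda acc, x: acc + x, [1, 2, 3, 4], 0)
print(running_sums)  # [1, 3, 6, 10]
```

Written this way the steps are strictly sequential. The work done inside a single call of fn can still be a large tensor operation, though, which is one place where a framework could use parallel hardware.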

The source code of tf.scan() includes a parameter parallel_iterations with a default value of 10. The function calls control_flow_ops, which in turn calls gen_control_flow_ops with parallel_iterations passed as an argument. As I understand it, gen_control_flow_ops parallelizes the execution of the function passed to tf.scan() in C++.
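
For reference, here is a small sketch of how parallel_iterations can be passed explicitly. It assumes TensorFlow 2.x with eager execution enabled, and the running-sum function is just an example chosen for this post, not something from the lecture.

```python
import tensorflow as tf

elems = tf.constant([1.0, 2.0, 3.0, 4.0])

# Running sum with tf.scan. parallel_iterations is set to its default (10)
# explicitly, only to show where the argument goes.
running_sums = tf.scan(
    lambda acc, x: acc + x,
    elems,
    initializer=tf.constant(0.0),
    parallel_iterations=10,
)

# Assumes eager execution, so .numpy() is available on the result.
result = running_sums.numpy().tolist()
print(result)  # [1.0, 3.0, 6.0, 10.0]
```

In this particular example each step depends on the previous accumulator, so the setting should not change the result. As far as I can tell, it only lets the runtime overlap work from different iterations where there is no such dependency.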