What dimensions should I consider for the parameters W, B, and X, and on what factors do they depend?
The w, b, and x are initialized randomly, which is easy to do with NumPy or TensorFlow.
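On the dimension question, here is a minimal NumPy sketch. It assumes a setup where the prediction is `X @ W.T + b`, with one row of `X` per item and one row of `W` per user. That layout is my assumption, not something stated above. The sizes `num_items`, `num_users`, and `num_features` are hypothetical. In such a setup the first two come from your data, and the number of features is a choice you make.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable

# Hypothetical sizes: the first two come from the data,
# num_features is a hyperparameter you pick.
num_items, num_users, num_features = 5, 4, 3

# Random initialization of all three arrays.
X = rng.normal(size=(num_items, num_features))  # one feature row per item
W = rng.normal(size=(num_users, num_features))  # one weight row per user
b = rng.normal(size=(1, num_users))             # one bias per user

# The shapes must line up so the prediction has one entry per (item, user).
pred = X @ W.T + b
print(X.shape, W.shape, b.shape, pred.shape)
```

If your model uses a different layout (for example, examples stored as columns), the same idea applies: pick the shapes so that the matrix product and the bias broadcast to the shape of your targets.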
The training of w, b, and x depends on the cost function, the data, and the gradient descent algorithm. The challenging part is gradient descent. TensorFlow calculates all the gradients for us; if you want to do it without TensorFlow, you will need to work out the maths and translate it into code, as in the assignments in courses 1 and 2. TensorFlow also provides ready-made optimizers such as the Adam optimizer, whereas implementing one yourself would take some time to learn how it works.
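To give a feel for what "working out the maths" looks like without TensorFlow, here is a rough sketch of plain gradient descent in NumPy. It assumes a simple squared-error cost `0.5 * sum((X @ W.T + b - Y) ** 2)` with no regularization, made-up targets `Y`, and hypothetical sizes and learning rate. Your actual cost function may differ, and then the gradient formulas would change accordingly.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes and made-up targets, for illustration only.
num_items, num_users, num_features = 5, 4, 3
Y = rng.uniform(0, 5, size=(num_items, num_users))

# Small random initial values for all three arrays.
X = 0.1 * rng.normal(size=(num_items, num_features))
W = 0.1 * rng.normal(size=(num_users, num_features))
b = 0.1 * rng.normal(size=(1, num_users))

def cost(X, W, b, Y):
    """Squared-error cost (assumed form, no regularization)."""
    E = X @ W.T + b - Y
    return 0.5 * np.sum(E ** 2)

learning_rate = 0.005  # hypothetical value
initial_cost = cost(X, W, b, Y)

for _ in range(500):
    E = X @ W.T + b - Y                     # prediction error, (items, users)
    # Gradients derived by hand from the cost above.
    grad_X = E @ W                          # same shape as X
    grad_W = E.T @ X                        # same shape as W
    grad_b = E.sum(axis=0, keepdims=True)   # same shape as b
    # Plain gradient descent update.
    X -= learning_rate * grad_X
    W -= learning_rate * grad_W
    b -= learning_rate * grad_b

final_cost = cost(X, W, b, Y)
print(initial_cost, final_cost)
```

Note that each gradient has the same shape as the array it updates, which is a handy check when you derive them by hand. An optimizer like Adam replaces the three update lines with a more elaborate rule, which is the part TensorFlow saves you from writing.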