Week 3 (General Question)

I just had a general question. Through backpropagation and gradient descent, we tweak the weights and biases to minimize the average error over all training examples. Are there any techniques that use backpropagation and gradient descent to tweak the activation functions as well? Maybe the constants used in activation functions, or something like that.


Hi @shaheer4; welcome to the DL Specialization. From a computational perspective it could be done, in principle. One could implement such a scheme using the deep learning library TensorFlow 2, which you will meet later in the course. That said, I know of no such application.
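To make the idea concrete, here is a minimal sketch of what "learning a constant inside the activation function" could look like. It uses plain NumPy rather than TensorFlow, and everything in it is illustrative: a toy 1-D dataset, a PReLU-style activation f(z) = z for z ≥ 0 and a·z for z < 0, and a hand-derived gradient of the mean-squared error with respect to the slope `a`. The point is only that `a` is just another parameter gradient descent can update.

```python
import numpy as np

# Toy data: inputs drawn from a normal distribution, targets produced by a
# PReLU-style activation with a hidden "true" negative slope of 0.3.
rng = np.random.default_rng(0)
z = rng.normal(size=200)
true_a = 0.3
y = np.where(z >= 0, z, true_a * z)

# Trainable activation constant, initialized so the activation starts
# out as the identity function.
a = 1.0
lr = 0.1
for _ in range(200):
    pred = np.where(z >= 0, z, a * z)
    # dL/da of the mean-squared error: the gradient only flows through
    # the negative-input branch, where f depends on a.
    grad = np.mean(2 * (pred - y) * np.where(z < 0, z, 0.0))
    a -= lr * grad

print(round(a, 3))  # gradient descent recovers the hidden slope, ≈ 0.3
```

This is essentially the idea behind the Parametric ReLU (PReLU), where the negative slope is a per-channel parameter learned jointly with the weights; in a real framework the same effect falls out automatically from automatic differentiation, with no hand-derived gradient needed.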
