Hello, @emi2025,
I have not used residual analysis for a long time, and I am not experienced in applying it to neural networks, so I can only share some views here.
Residual analysis lets us discover potentially missing terms in our model, such as a quadratic term if the residual plot looks quadratic. In that case, we can certainly do feature engineering to add some quadratic terms and look for improvement, which I think is good if we already know what those terms should be. However, if we don’t know them, or if the terms do not have any simple mathematical form, then, since hidden layers add non-linearity, instead of treating the plot as a signal for adding certain terms, we could just as well treat it as a signal for a larger network. That said, we have also learned about Bias and Variance (in MLS Course 2), and high Bias would already have hinted that we should use a larger network. The challenge, then, is how you decide whether the current J_{train} (refer to the C2W3 lecture Diagnosing bias and variance) is too high.
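To make the "missing quadratic term" case concrete, here is a minimal sketch on synthetic data. Everything in it is an assumption made for illustration (the data-generating formula, the helper name `fit_and_residuals`, and using a correlation with x^2 as a numeric stand-in for eyeballing the residual plot); it uses plain least squares rather than a neural network.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (assumed for illustration): the true relation has a quadratic term.
x = rng.uniform(-3.0, 3.0, size=500)
y = 1.0 + 2.0 * x + 1.5 * x**2 + rng.normal(0.0, 1.0, size=500)


def fit_and_residuals(features, target):
    """Least-squares fit on [1, *features]; returns residuals target - prediction."""
    X = np.column_stack([np.ones(len(target))] + features)
    w, *_ = np.linalg.lstsq(X, target, rcond=None)
    return target - X @ w


# Model A: linear in x only. Model B: adds the engineered x^2 feature.
res_linear = fit_and_residuals([x], y)
res_quad = fit_and_residuals([x, x**2], y)

mse_linear = np.mean(res_linear**2)
mse_quad = np.mean(res_quad**2)

# A residual plot that "looks quadratic" means the residuals still track x^2.
curv_linear = np.corrcoef(res_linear, x**2)[0, 1]
curv_quad = np.corrcoef(res_quad, x**2)[0, 1]

print(f"MSE linear: {mse_linear:.2f}, MSE with x^2: {mse_quad:.2f}")
print(f"corr(residual, x^2) linear: {curv_linear:.2f}, with x^2: {curv_quad:.2f}")
```

In this setup the residuals of the linear model stay strongly correlated with x^2, and that structure disappears once the x^2 feature is added. A larger network could absorb the same structure without the engineered feature, which is the trade-off described above.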
On the other hand, if, instead of missing terms, the plot shows a possible “unequal variance” (i.e. heteroscedasticity) situation (such a plot is shown below)
then, traditionally, it would suggest that we “stabilize” the variance with some transformation of the label, such as \sqrt{y}, \frac{1}{y}, and so on. These transformations, of course, might also come from your domain knowledge rather than from the residual plots. Again, if, for example, \sqrt{y} works, then because \sqrt{y} = f_1(x) \implies y = f_1(x)^2 \implies y = f_2(x), where f_1 and f_2 are some neural networks, we might also say that it is possible to change our NN from f_1 to f_2 to take care of that transformation.
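As a rough sketch of the variance-stabilizing idea, the synthetic example below assumes that \sqrt{y} is linear in x with constant noise, so that y itself has noise that grows with x. The data, the helper names `residuals` and `spread_ratio`, and the choice of comparing the residual spread in the upper and lower halves of x are all assumptions for illustration, again with least squares standing in for a network.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2000

# Assumed setup: sqrt(y) is linear in x with constant-variance noise,
# so the noise in y itself grows with x (heteroscedasticity).
x = rng.uniform(0.0, 10.0, size=n)
sqrt_y = 2.0 + x + rng.normal(0.0, 0.3, size=n)
y = sqrt_y**2


def residuals(columns, target):
    """Least-squares fit on [1, *columns]; returns target - prediction."""
    X = np.column_stack([np.ones(len(target))] + columns)
    w, *_ = np.linalg.lstsq(X, target, rcond=None)
    return target - X @ w


def spread_ratio(res, x):
    """Std of residuals for the upper half of x divided by that of the lower half."""
    high = x > np.median(x)
    return np.std(res[high]) / np.std(res[~high])


# Raw label: the mean of y is quadratic in x, so include x^2 to isolate the spread.
res_raw = residuals([x, x**2], y)
# Transformed label: fit sqrt(y) with a plain linear model.
res_sqrt = residuals([x], np.sqrt(y))

ratio_raw = spread_ratio(res_raw, x)
ratio_sqrt = spread_ratio(res_sqrt, x)

print(f"spread ratio on y: {ratio_raw:.2f}")
print(f"spread ratio on sqrt(y): {ratio_sqrt:.2f}")
```

A ratio well above 1 on the raw label corresponds to the fan shape in a residual plot, while a ratio close to 1 after the transformation suggests the spread has been evened out. Predictions for y would then be recovered by squaring the fitted \sqrt{y}, which is the y = f_1(x)^2 step above.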
Having said the above, it may seem that residual analysis is not strictly necessary. But it is a tool, and if you know the tool well, it may reveal some details that your other tools (e.g. Bias and Variance) can’t reveal directly. If I knew that a quadratic term and \sqrt{y} were required, I would apply them right away instead of just guessing what my next NN should be, because I could have a lot of problems on my way ahead, and wouldn’t it be wonderful to get rid of the obvious ones first?
Lastly, as I said, I have not used it for some time. I can share some views based on my understanding, but giving you the best answer would take a fuller understanding of the tool.
Cheers,
Raymond