Hello -
I was looking for a specific example in this course to relate this question to, but couldn’t find it exactly, or maybe I missed it. The class 2 week 2 assignment is close, except no plot is actually made. This question may be kind of a long shot, but hopefully someone can share some thoughts:
The scenario I am interested in is one where there are more than 3 inputs into a NN, with either a binary or multi-class classification output. For the sake of discussion, say we have 5 inputs (features) and a binary output of 1 or 0. It would be great to be able to plot a decision boundary, but doing so with all the inputs seems impossible to visualize. I think the next best thing would be to pick 2 inputs and plot a decision boundary for just those 2, but the decision boundary depends on all of the model’s parameters (weights). (If this were a logistic regression rather than an NN, we could do some algebra and back-solve for the two features we are interested in.) I see plenty of explanations all over the internet of how to plot a decision boundary by setting up a grid/mesh on the plot, running the grid points through the model, and plotting the ones whose output is near your threshold, but those are almost exclusively for 2-feature problems.
I can’t find much of anything for multiple features. Questions:
1.) Is there a lab I missed that has a decision boundary plotted for a situation I described above?
2.) Is plotting a multi-feature model’s output against only 2 of the features rarely done, and is that why I can’t seem to find much information on this topic?
3.) I could vary one feature while holding all of the others constant to get an output, and repeat the process until I cover all of the possibilities, but that seems inefficient, and I was wondering whether this has already been done in a library function somewhere that I’m not aware of.
You cannot make a 2D plot that shows the decision boundary if there are more than two features.
There is no way to render a plot with more than two feature axes on a flat 2D display.
Since it’s not possible to plot a high-dimensional data set directly, we don’t rely on plots of the data or of the decision boundary.
A method you can use in this situation is provided later in the course.
It involves plotting the cost value, and comparing the results using different subsets of the data set - one set you use for training, and another set you use to validate the results.
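As a rough illustration of that comparison (a minimal sketch, not the course's own code), the snippet below computes the average binary cross-entropy cost on a training subset and on a validation subset. The function name `binary_cross_entropy` and all of the label and probability values are made up for the example; in practice the probabilities would come from your trained model.

```python
import math

def binary_cross_entropy(y_true, y_prob, eps=1e-12):
    """Average binary cross-entropy cost over a set of examples."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

# Hypothetical labels and model outputs, for illustration only.
train_labels = [1, 0, 1, 0]
train_probs = [0.9, 0.1, 0.8, 0.2]
val_labels = [1, 0, 1, 0]
val_probs = [0.6, 0.4, 0.3, 0.7]

train_cost = binary_cross_entropy(train_labels, train_probs)
val_cost = binary_cross_entropy(val_labels, val_probs)
print(f"training cost:   {train_cost:.3f}")
print(f"validation cost: {val_cost:.3f}")
```

A validation cost that is much higher than the training cost, as in these made-up numbers, is the kind of gap this comparison is meant to reveal, and it works no matter how many features the model has.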
I think sampling decision values over the feature space of interest is a very reasonable approach, because it can be applied to any model. Generally speaking, more advanced algorithms usually come with constraints or requirements on the model that might not always be applicable or favourable, so I really doubt there is a library for such an algorithm. I have not seen such a library before.
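A minimal sketch of that sampling idea, assuming a 5-feature binary classifier: two chosen features are swept over a grid while the other three stay at a fixed baseline, and each grid point is thresholded at 0.5. The names `toy_model` and `slice_grid`, the weights, and the baseline are all hypothetical. `toy_model` is only a logistic stand-in so the example runs on its own, and a trained NN's prediction function would take its place.

```python
import math

def toy_model(x):
    """Hypothetical stand-in for a trained 5-feature binary classifier.
    Returns a probability in (0, 1)."""
    w = [1.5, -2.0, 0.5, 0.0, 1.0]  # made-up weights
    b = -0.25                       # made-up bias
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))

def slice_grid(predict, baseline, i, j, lo, hi, steps):
    """Evaluate predict on a steps x steps grid over features i and j,
    keeping every other feature at its baseline value."""
    axis = [lo + (hi - lo) * k / (steps - 1) for k in range(steps)]
    grid = []
    for vj in axis:        # rows follow feature j
        row = []
        for vi in axis:    # columns follow feature i
            x = list(baseline)
            x[i] = vi
            x[j] = vj
            row.append(predict(x))
        grid.append(row)
    return axis, grid

baseline = [0.0, 0.0, 0.0, 0.0, 0.0]  # fixed values for the features not plotted
axis, grid = slice_grid(toy_model, baseline, i=0, j=1, lo=-2.0, hi=2.0, steps=5)

# Threshold at 0.5; the 0/1 transitions trace the boundary within this slice.
labels = [[1 if p >= 0.5 else 0 for p in row] for row in grid]
for row in labels:
    print(row)
```

The resulting grid could be handed to any contour-plotting routine. Note that it is only one slice: changing `baseline` moves the boundary, so the picture says nothing about the model away from those fixed values.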
I think humans are used to visualizing 1-, 2-, and 3-dimensional plots. Visualizing higher dimensions requires a creative approach and, I believe, some training for the viewers. Also, the concept of a “boundary” is pretty spatial, so pursuing a boundary line might limit you to <=3 dimensions. And, generally speaking, I think taking 2 features and leaving the others out to make a plot won’t give you a meaningful boundary.
Navead @naveadjensen, if I were you and wanted to explain my model, I would not try to look for a general approach for this. Explaining something really requires your own insights and knowledge about the problem itself. They are necessary for distilling useful information for presentations.
Although I have no suggestion for plotting a high-dimensional boundary, I know many people try to visualize high-dimensional data. Doing a bit of research on that, and visualizing the useful information you distill, could be a way out.
I agree it is not possible in the way that you have explained it. What I am thinking of is possible, but probably meaningless in the way I am thinking about it. Your post has helped clarify that for me.
I am not thinking of it in the correct way. Thanks!