I am confused about the naming… why do we say we have high variance when the model overfits? Is this in any way related to variance in statistics (the square of the standard deviation)?
Good question. Let me try to clarify.
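Yes, it is the same variance you know from statistics, but it is the variance of the model's predictions, not the spread of the points in the dataset. Imagine training the same kind of model many times, each time on a different random sample drawn from the same underlying data. If the predictions for a given input change a lot from one training sample to the next, the model has high variance. A very flexible model tries to fit every training point accurately, including the noise, so its fit depends heavily on which particular points it happened to see. It then fits the seen data almost perfectly but does poorly on unseen data, which is what we call overfitting. That is why high variance is associated with overfitting.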
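If it helps to see the "variance" in high variance concretely, here is a rough sketch that retrains a simple and a flexible model on many fresh training samples and measures how much their predictions move around. The sine-shaped target, the noise level, the sample sizes, and the polynomial degrees (1 and 9) are all assumptions I picked for illustration, not anything special.

```python
import numpy as np

def prediction_variance(degree, n_datasets=200, n_points=15, noise=0.3, seed=0):
    """Average variance of a polynomial model's predictions across
    many independently drawn training sets (all settings are illustrative)."""
    rng = np.random.default_rng(seed)
    x_test = np.linspace(0.0, 1.0, 50)          # fixed inputs to predict at
    preds = np.empty((n_datasets, x_test.size))

    for i in range(n_datasets):
        # Draw a fresh training set from the same underlying process.
        x = rng.uniform(0.0, 1.0, n_points)
        y = np.sin(2 * np.pi * x) + rng.normal(0.0, noise, n_points)

        # Fit a polynomial of the given degree and predict at x_test.
        coeffs = np.polyfit(x, y, degree)
        preds[i] = np.polyval(coeffs, x_test)

    # Variance over training sets at each test input, averaged over inputs.
    return preds.var(axis=0).mean()

low = prediction_variance(degree=1)    # simple model
high = prediction_variance(degree=9)   # flexible model, prone to overfitting

print(f"degree 1 prediction variance: {low:.4f}")
print(f"degree 9 prediction variance: {high:.4f}")
```

The flexible model's predictions should vary far more between training sets than the straight line's do. NumPy may warn that the degree 9 fit is poorly conditioned, which is itself a symptom of the same instability.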
There are many articles on the internet about this. One you can go through is http://scott.fortmann-roe.com/docs/BiasVariance.html
Hope this helps. Let me know if you have any other questions. Happy to help!