I just can’t get my head around what min_samples_split is actually doing… maybe I’m too exhausted. Can someone please explain what this line means: “The minimum number of samples required to split an internal node”?
min_samples_split is the smallest number of datapoints a group (a node) must contain for the tree to be allowed to split it. A node with fewer samples than that is not split any further.
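So imagine arriving at a node with a group of 10 cats/dogs. Let's also imagine min_samples_split = 4. Because your group of 10 is at least 4, you can still split that group into two groups at that node.
Now let's imagine that node breaks the group of 10 into a group of 7 and a group of 3. The group of 7 is at least 4, so it is still eligible to be split at the next node (it may still end up unsplit for other reasons, for example if it already contains only cats). However, the group of 3 is smaller than 4, so because of the rule we set up, that group cannot be split anymore and it becomes a leaf node. Note that a group of exactly 4 would still be allowed to split, since the check is "at least min_samples_split".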
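If it helps to see the rule as code, here is a tiny sketch of just the check described above. It is not scikit-learn's actual implementation; `can_split` and `MIN_SAMPLES_SPLIT` are made-up names for illustration, and the group sizes are the ones from the cats/dogs example plus the boundary case of exactly 4.

```python
def can_split(n_samples: int, min_samples_split: int) -> bool:
    """Hypothetical helper: is a node with n_samples still allowed to be split?"""
    # A node needs at least min_samples_split samples to be considered for a split.
    return n_samples >= min_samples_split


MIN_SAMPLES_SPLIT = 4  # same value as in the example above

# 10 is the starting group, 7 and 3 are its two children, 4 is the boundary case.
for n in (10, 7, 4, 3):
    verdict = "may be split" if can_split(n, MIN_SAMPLES_SPLIT) else "becomes a leaf"
    print(f"node with {n} samples: {verdict}")
```

Passing the check only means the node is a candidate for splitting; other settings (such as max_depth or min_samples_leaf) or a node that is already pure can still stop it.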
Hope this helps, but let me know if this is still not clear.