In the lecture video, Andrew says that because only 1/3 of the examples in the set are cats, the model will predict ‘Not Cat’.


I am wondering what happens if the set had, for example, 1 cat and 1 dog, and no further splits were possible (because the maximum depth was reached, the information gain was below the threshold, and/or the number of examples was below the threshold). What would the model predict?
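
To make the scenario concrete, here is a minimal sketch of how I picture a leaf making its prediction by majority vote. The function name `leaf_prediction` and the tie-breaking rule (pick the alphabetically first label) are my own assumptions for illustration, not something stated in the lecture; an actual implementation may break ties differently.

```python
from collections import Counter


def leaf_prediction(labels):
    """Return (prediction, is_tie) for the labels that reached a leaf."""
    counts = Counter(labels)
    top = max(counts.values())
    # Every label that shares the highest count is a candidate winner.
    winners = sorted(label for label, n in counts.items() if n == top)
    # Assumption: on a tie, pick the alphabetically first label.
    # This rule is hypothetical; it is not from the lecture.
    return winners[0], len(winners) > 1


# The lecture case: 1/3 of the examples are cats, so the majority wins.
print(leaf_prediction(["Cat", "Not Cat", "Not Cat"]))  # ('Not Cat', False)

# The case I am asking about: 1 cat and 1 dog, so there is no majority.
print(leaf_prediction(["Cat", "Not Cat"]))  # ('Cat', True)
```

In the second call the counts are equal, so the answer depends entirely on whichever tie-breaking rule is assumed.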
Thanks!