Week 3 Assignment: encountered scikit-learn error

  • Link to the classroom item you are referring to: Planar Data Classification with One Hidden Layer (Coursera)
  • Description: I encountered the following error when running section "3 - Simple Logistic Regression":
    C:\Program Files\Python312\Lib\site-packages\sklearn\utils\validation.py:1339: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().
    y = column_or_1d(y, warn=True)

May I ask how I can proceed?

This is how you proceed: change the shape of y to (n_samples, ). You could use numpy.reshape() or numpy.ravel(), as the message suggests. Try experimenting with those and see if you can get the right shape!
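For reference, here is a minimal sketch of what that reshaping looks like. The array `y_col` is made-up example data (not the assignment's labels), so treat it only as an illustration of the shapes involved.

```python
import numpy as np

# Hypothetical column vector of labels, shape (4, 1)
y_col = np.array([[0], [1], [1], [0]])

# Two equivalent ways to flatten it to shape (n_samples,)
y_flat = y_col.ravel()
y_flat_alt = y_col.reshape(-1)

print(y_col.shape, y_flat.shape, y_flat_alt.shape)
```

Either result is a 1D array of the kind the warning message asks for.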

I did not have this problem in that notebook, and I did not have to reshape Y to be a 1D array to get things to work. Please show us the complete exception trace that you are getting. That will make it clearer what is going on than looking at the error message by itself.

The only place we use sklearn in this assignment is early in the notebook, where it is used to show that Logistic Regression does not do a very good job on this prediction problem. I just looked at that logic to refresh my memory, and the Y value we are given has shape (1, 400), meaning that it is a row vector. The logic they gave us to train the scikit LR module does a transpose of Y to make it a (400, 1) column vector, and that worked fine in my notebook.
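To make the shapes concrete, here is a rough sketch of the situation, not the notebook's actual code. It assumes synthetic data in place of the assignment's dataset and uses sklearn's plain `LogisticRegression` as a stand-in classifier; `fit_and_collect_warnings` is a hypothetical helper written just for this illustration.

```python
import warnings

import numpy as np
from sklearn.exceptions import DataConversionWarning
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in data: X is (n_features, m), Y is a (1, 400) row vector
rng = np.random.default_rng(0)
X = rng.normal(size=(2, 400))
Y = (X[0:1, :] > 0).astype(int)


def fit_and_collect_warnings(features, labels):
    """Fit a classifier and return it with any DataConversionWarning raised."""
    clf = LogisticRegression()
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        clf.fit(features, labels)
    relevant = [w for w in caught if issubclass(w.category, DataConversionWarning)]
    return clf, relevant


# Transposed labels: a (400, 1) column vector
_, col_warnings = fit_and_collect_warnings(X.T, Y.T)

# Flattened labels: a (400,) 1D array
_, flat_warnings = fit_and_collect_warnings(X.T, Y.ravel())

print("column vector warnings:", len(col_warnings))
print("1D array warnings:", len(flat_warnings))
```

On a recent scikit-learn install I would expect the column-vector call to report the DataConversionWarning and the flattened call to report none, but the fit itself should complete in both cases.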

Did you modify any of that given code or are you perhaps running this locally instead of on the Coursera website?


Thank you for the reply. Yes, I was trying to run this locally so I could familiarize myself with my own machine as well. Anyway, I will use the one on Coursera directly. Thank you.

Ok, that makes sense then. You were hitting a “versionitis” problem: the notebooks in this course were last updated in a major way in April of 2021, so they use the versions of all packages that were current at that time. There is no guarantee they will work if you run them locally with the latest versions of everything. For example, in this case it looks like scikit-learn changed the definitions of some of its APIs in a way that is not backwards compatible.
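If you do want to keep experimenting locally, one thing that may help is checking which package versions you actually have installed, so you can compare them with the course environment. This is only a sketch: `installed_version` is a hypothetical helper, and the package list is my guess at what is relevant here, not an official list from the course.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string, or a placeholder if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return "not installed"


# Assumed list of packages worth comparing; adjust as needed
for name in ("numpy", "scikit-learn", "matplotlib"):
    print(f"{name}: {installed_version(name)}")
```

You can run the same snippet in a cell on the Coursera notebook to see what versions the assignment was built against.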