Is it possible to extend that feature extraction algorithm to multi-class classification?

I would like to know if the feature extraction algorithm used in week one can be extended to multi-class classification, and whether it still makes sense in such a case.

Hi Mauricio_Toro,

You can use CountVectorizer for multi-class classification. See, e.g., this post.
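
As a rough illustration of the suggestion above, here is a minimal sketch, assuming "countvectorizer" refers to scikit-learn's CountVectorizer. The toy texts, the three class names, and the choice of LogisticRegression as the classifier are assumptions made up for the example; they are not taken from the course.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

# Toy corpus with three hypothetical classes (invented for illustration).
texts = [
    "the team won the match",
    "a great goal in the final match",
    "the election results are in",
    "the senate passed the vote",
    "new phone with a faster chip",
    "the laptop has a new chip",
]
labels = ["sports", "sports", "politics", "politics", "tech", "tech"]

# Bag-of-words counts: one column per vocabulary word, stored as a sparse matrix.
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(texts)

# LogisticRegression accepts more than two classes without extra setup.
clf = LogisticRegression(max_iter=1000)
clf.fit(X, labels)

# Classify an unseen text using the same fitted vocabulary.
prediction = clf.predict(vectorizer.transform(["the match ended with a late goal"]))
```

The feature extraction step does not depend on the number of classes here; only the classifier needs to handle more than two labels.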


But do you mean using TF-IDF? The problem with TF-IDF is that it produces a very sparse feature matrix. Let’s say I have 50 classes and 30,000 unique words. I was wondering whether I could sum the word frequencies for each of the 50 classes and get just 50 features to represent each text, as a generalisation of the sum of frequencies presented in week 1 for binary classification?
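
To make the idea concrete, here is a minimal sketch of what I have in mind, with 3 classes instead of 50. The helper names build_freqs and extract_features, the toy texts, and the plain whitespace tokenisation (no stop-word removal or stemming) are all assumptions for illustration.

```python
from collections import Counter


def build_freqs(texts, labels):
    """Count how often each (word, label) pair occurs in the training texts."""
    freqs = Counter()
    for text, label in zip(texts, labels):
        for word in text.lower().split():
            freqs[(word, label)] += 1
    return freqs


def extract_features(text, freqs, classes):
    """Return one feature per class: the summed class frequency of the text's words."""
    words = text.lower().split()
    # Counter returns 0 for (word, class) pairs never seen in training.
    return [sum(freqs[(w, c)] for w in words) for c in classes]


# Toy training data with three hypothetical classes.
texts = [
    "the team won the match",
    "a great goal in the final match",
    "the election results are in",
    "the senate passed the vote",
    "new phone with a faster chip",
    "the laptop has a new chip",
]
labels = ["sports", "sports", "politics", "politics", "tech", "tech"]

classes = sorted(set(labels))  # fixed class order defines the feature order
freqs = build_freqs(texts, labels)

# One number per class, regardless of vocabulary size.
features = extract_features("the match ended with a late goal", freqs, classes)
```

Each text is then represented by as many numbers as there are classes, so the representation is dense, but very common words such as "the" contribute to every class sum unless they are filtered out first.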


Hi Mauricio_Toro,

As I indicated in my response to another question you asked, I feel CountVectorizer is used here mainly as a pedagogical tool. The fact that it can be used for multi-class classification does not mean it is the most efficient option. You will find in the rest of the specialization that more efficient ways exist to extract features.
