In the examples and assignment there were some examples of turning a string into an integer using tft.compute_and_apply_vocabulary. What is the best way to extract features from free-format text using TensorFlow Transform (tft)? I see tft.ngrams, but an example would be appreciated.
The question you have asked is too big to be answered in a few sentences.
Just to give you an idea: the subject of how to effectively extract “features” from a text is at the heart of all of NLP.
If you are able to effectively transform a sentence into a set of numerical features, then you can go on and train a classifier (or regressor) on those features to get your model. Beyond that, you only need data to train it.
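To make the "sentence to numerical features" step concrete, here is a minimal pure-Python sketch of the classic bag-of-n-grams approach. It is roughly the idea behind combining tft.ngrams with tft.compute_and_apply_vocabulary, but it does not use TF Transform itself: the helper names (tokenize, ngrams, build_vocabulary, vectorize), the whitespace tokenizer, and the toy corpus are all assumptions made for illustration.

```python
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def ngrams(tokens, n_min=1, n_max=2):
    """Return all n-grams with n_min <= n <= n_max, joined by a space."""
    return [
        " ".join(tokens[i:i + n])
        for n in range(n_min, n_max + 1)
        for i in range(len(tokens) - n + 1)
    ]


def build_vocabulary(texts, top_k=None):
    """Map each n-gram to an integer id, most frequent first."""
    counts = Counter(g for t in texts for g in ngrams(tokenize(t)))
    # Sort by descending count, then alphabetically so ids are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return {gram: idx for idx, (gram, _) in enumerate(ranked[:top_k])}


def vectorize(text, vocab):
    """Turn a text into a fixed-length vector of n-gram counts."""
    vector = [0] * len(vocab)
    for gram in ngrams(tokenize(text)):
        idx = vocab.get(gram)
        if idx is not None:  # out-of-vocabulary n-grams are dropped
            vector[idx] += 1
    return vector


# Hypothetical toy corpus, only for illustration.
texts = ["the movie was great", "the movie was boring"]
vocab = build_vocabulary(texts)
features = [vectorize(t, vocab) for t in texts]
```

Each row of features is a fixed-length count vector that could be fed to any standard classifier. In a real pipeline you would probably want a better tokenizer, a vocabulary size limit (top_k here), and some weighting such as TF-IDF.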
The way to go today is to use large neural network models such as BERT, RoBERTa, or GPT-3. On the Hugging Face site you’ll find a large set of pre-trained models that can be used, but these are very large models, with hundreds of millions of parameters or more. Understanding the structure of language is not an easy task.
So, in my honest opinion, it is not possible to give a simple answer to your question. But if you want to go deeper, there are many good books (and courses) on NLP. I would recommend starting with the second edition of F. Chollet’s book, Deep Learning with Python (Chollet is the creator of Keras), which you can find on Manning’s site.