W3 W1 Exercise 6

I am a little puzzled about how the SKLearn evaluation part of this notebook (Exercise 6) works.

We have trained and optimized a BERT model and selected the best.

We then seem to be evaluating using, I guess, some sort of sklearn container?

The BERT Model is presumably a set of weights for a Deep Learning model.

How is it possible to perform inference on a generic sklearn container using a BERT model?


I have the same question. Here’s what I speculate:

  • When we train the model, we use BERT from PyTorch.
  • The generated models (the candidates) are *.tar.gz files.
  • When we evaluate the model, the SKLearn container runs as a Processing job: its script uses the model (*.tar.gz) file to process the data, so the script itself performs the inference.
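
To make that guess concrete, here is a minimal sketch of the pattern, not the notebook's actual evaluation script. The container only needs to run Python; the script unpacks the model archive and does the predicting itself. Everything here is a hypothetical stand-in: the "model" is a tiny JSON word list rather than BERT weights, and the names make_fake_model_archive, load_model, predict and evaluate are made up for illustration. A real script would presumably load the weights with PyTorch instead.

```python
import io
import json
import tarfile
import tempfile
from pathlib import Path


def make_fake_model_archive(archive_path: Path) -> None:
    """Stand-in for the training output: a .tar.gz holding one 'weights' file."""
    weights = json.dumps(
        {"positive": ["good", "great"], "negative": ["bad", "awful"]}
    ).encode("utf-8")
    info = tarfile.TarInfo(name="model.json")  # hypothetical file name
    info.size = len(weights)
    with tarfile.open(archive_path, "w:gz") as tar:
        tar.addfile(info, io.BytesIO(weights))


def load_model(archive_path: Path) -> dict:
    """Open the archive and read the weights, as an evaluation script would."""
    with tarfile.open(archive_path, "r:gz") as tar:
        member = tar.extractfile("model.json")
        return json.load(member)


def predict(model: dict, text: str) -> int:
    """Toy sentiment rule standing in for a BERT forward pass: returns -1, 0 or 1."""
    words = text.lower().split()
    score = sum(w in model["positive"] for w in words)
    score -= sum(w in model["negative"] for w in words)
    return (score > 0) - (score < 0)


def evaluate(model: dict, examples: list) -> float:
    """Accuracy over (text, label) pairs, computed entirely inside the script."""
    correct = sum(predict(model, text) == label for text, label in examples)
    return correct / len(examples)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        archive = Path(tmp) / "model.tar.gz"
        make_fake_model_archive(archive)
        model = load_model(archive)
        data = [("great product", 1), ("awful fit", -1), ("it arrived", 0)]
        print("accuracy:", evaluate(model, data))
```

If this is right, the scikit-learn part of the container is incidental: it supplies a Python environment, and the script is free to load whatever model format it knows how to read.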

See the comments in the Notebook:
“To perform model evaluation you will use a scikit-learn-based Processing Job. This is essentially a generic Python Processing Job with scikit-learn pre-installed. You can specify the version of scikit-learn you wish to use. Also pass the SageMaker execution role, processing instance type and instance count.”

And in the Notebook cell the code is:

from sagemaker.processing import ProcessingInput, ProcessingOutput