How to install additional Python packages not included in the PyTorch image

Dear mentors, I am trying things out on my own in SageMaker using my personal AWS account.
My training code needs both nltk and PyTorch.
In a SageMaker Studio notebook, I set up my PyTorch estimator as below:

estimator = PyTorchEstimator(
    py_version='py3', # dynamically retrieves the correct training image (Python 3)
    framework_version='1.6.0', # dynamically retrieves the correct training image (PyTorch)

Then I used

to execute the training. However, I got a “module not found” error for nltk.
In this case, I think the default PyTorch image does not have nltk installed.
Could you advise me how to install the extra packages my code needs so that I can train successfully on SageMaker? Thanks in advance!

Dear mentors, I have found the solution on the “Use PyTorch with the SageMaker Python SDK” page of the sagemaker 2.59.5 documentation.
The problem was that my requirements file in the “src” directory was named “requirements_train.txt” rather than “requirements.txt”, so the estimator did not recognize it.
After I renamed it to “requirements.txt”, the nltk package was installed successfully. Thanks anyway! :slight_smile:
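
In case it helps anyone else, here is a rough sketch of the setup that worked for me, plus a small check that would have caught my mistake before launching the job. The helper names `check_source_dir` and `build_estimator`, the entry point name `train.py`, and the instance type are only assumptions for illustration and are not from my actual project. I use the `PyTorch` class from `sagemaker.pytorch` (SDK v2 style arguments); adjust this if your estimator is imported under another name.

```python
from pathlib import Path
from typing import List


def check_source_dir(source_dir: str, entry_point: str = "train.py") -> List[str]:
    """Return a list of problems that would stop extra packages from installing.

    Hypothetical helper: it only inspects the local folder, it does not call AWS.
    """
    src = Path(source_dir)
    problems = []
    if not (src / entry_point).is_file():
        problems.append(f"entry point {entry_point!r} not found in {source_dir!r}")
    if not (src / "requirements.txt").is_file():
        # A file such as requirements_train.txt is not picked up; the name
        # has to be exactly requirements.txt (this was my problem).
        others = sorted(p.name for p in src.glob("requirements*.txt"))
        problems.append(f"no requirements.txt in {source_dir!r} (found: {others})")
    return problems


def build_estimator(role: str, source_dir: str = "src", entry_point: str = "train.py"):
    """Create a PyTorch estimator whose source_dir ships a requirements.txt."""
    # Imported here so the check above can run without the sagemaker SDK.
    from sagemaker.pytorch import PyTorch

    problems = check_source_dir(source_dir, entry_point)
    if problems:
        raise FileNotFoundError("; ".join(problems))

    return PyTorch(
        entry_point=entry_point,       # assumed script name inside source_dir
        source_dir=source_dir,         # folder that also holds requirements.txt
        role=role,                     # your SageMaker execution role ARN
        py_version="py3",
        framework_version="1.6.0",
        instance_count=1,
        instance_type="ml.m5.large",   # assumption; pick what fits your budget
    )
```

With this layout, `src/requirements.txt` just needs a line containing `nltk`. Calling `check_source_dir("src")` in the notebook before creating the estimator returns an empty list when everything is in place, and a message mentioning the misnamed file otherwise.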


Awesome, @thicc_fart! :slight_smile: Thanks for sharing!!