C2-W1 - prepare_data.py gives errors

I get the following error when running the script. I really need this working in order to continue, please respond quickly.


CalledProcessError Traceback (most recent call last)
in
3
4 # import the prepare_data.py module
----> 5 import prepare_data
6
7 # reload the module if it has been previously loaded

~/src/prepare_data.py in
26 subprocess.check_call([sys.executable, "-m", "conda", "install", "-c", "pytorch", "pytorch==1.6.0", "-y"])
27
---> 28 subprocess.check_call([sys.executable, "-m", "conda", "install", "-c", "conda-forge", "transformers==3.5.1", "-y"])
29 from transformers import RobertaTokenizer
30

/opt/conda/lib/python3.7/subprocess.py in check_call(*popenargs, **kwargs)
361 if cmd is None:
362 cmd = popenargs[0]
--> 363 raise CalledProcessError(retcode, cmd)
364 return 0
365

CalledProcessError: Command '['/opt/conda/bin/python', '-m', 'conda', 'install', '-c', 'conda-forge', 'transformers==3.5.1', '-y']' died with <Signals.SIGKILL: 9>.
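
A note for anyone reading this traceback: "died with <Signals.SIGKILL: 9>" means the conda process was killed from outside rather than exiting with its own error code. In notebook environments that is often a sign of the instance running out of memory, though the traceback alone does not prove it. Below is a minimal sketch of how to tell the two cases apart; `describe_failure` and `run_checked` are hypothetical helper names, not part of prepare_data.py, and the signal lookup assumes a POSIX system.

```python
import signal
import subprocess
import sys


def describe_failure(error: subprocess.CalledProcessError) -> str:
    """Summarise why a subprocess started with check_call failed."""
    if error.returncode < 0:
        # On POSIX, a return code of -N means the child was terminated by signal N.
        sig = signal.Signals(-error.returncode)
        return f"killed by signal {sig.name} ({sig.value})"
    return f"exited with status {error.returncode}"


def run_checked(cmd) -> str:
    """Run cmd and return "ok" or a short description of the failure."""
    try:
        subprocess.check_call(cmd)
        return "ok"
    except subprocess.CalledProcessError as error:
        return describe_failure(error)
```

Calling `run_checked([sys.executable, "-m", "conda", "install", ...])` with the same arguments as in the script would then report a kill signal separately from an ordinary conda error.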

Hi @PDS_Mentors,

Can one of you help here?

Thanks,
Mubsi

Hello Sacha,

Did you complete these steps before running the current cell?

  1. Open the file src/prepare_data.py. Go through the comments to understand its content.
  2. Find and review the convert_to_bert_input_ids() function, which contains the RoBERTa tokenizer configuration.
  3. Complete the call to the RoBERTa tokenizer's encode_plus method. Pass max_seq_length as the value of the max_length argument. This sets the maximum length that sequences are padded to.
  4. Save the file src/prepare_data.py (with the menu command File → Save Python File).
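
To illustrate what step 3 is asking for, here is a rough sketch of what padding to a maximum length does to a list of token ids, assuming both padding and truncation are enabled. It is not the tokenizer's implementation: `pad_or_truncate` is a hypothetical helper, the default `pad_id=1` is an assumed placeholder, and the truncation here is a plain cut that ignores special tokens.

```python
def pad_or_truncate(token_ids, max_seq_length, pad_id=1):
    """Return (input_ids, attention_mask), each exactly max_seq_length long.

    pad_id is an assumed placeholder value for the padding token id.
    """
    # Cut sequences that are too long (simplified: special tokens are not preserved).
    clipped = list(token_ids[:max_seq_length])
    # Real tokens get mask 1, padding positions get mask 0.
    attention_mask = [1] * len(clipped)
    padding = max_seq_length - len(clipped)
    return clipped + [pad_id] * padding, attention_mask + [0] * padding


short_ids, short_mask = pad_or_truncate([0, 10, 2], max_seq_length=5)
long_ids, long_mask = pad_or_truncate([0, 10, 11, 12, 2], max_seq_length=3)
```

Every sequence comes out with the same length, which is why max_length has to be set before the rest of the notebook can batch the inputs.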

Thanks for reaching out. Yes, I did. The stupid thing is that it got stuck when importing the script to run it on an example. However, once I ran it as a script on the sklearn container, it all worked. So something in the local course environment is, or was, not working properly.


I keep getting the same error in Exercise 2:

#######################################################################################################
Please check that the function 'convert_to_bert_input_ids' in the file src/prepare_data.py is complete.
#######################################################################################################

Did you set max_length in the script?

I'm dealing with the same problem here. Is there a way to fix this issue?

It has not been fixed.

Did anyone manage to resolve this? I am stuck with the same errors, and I have made the required changes to the Python code in src as well.

Once SageMaker has loaded, click “Data Science” at the top right. A dropdown menu should appear that says something like “change kernel”. Change it to “Data Science 2.0”. Sometimes you need to retry, but once the kernel has loaded it solves the dependency issues :)
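
If you want to confirm that the new kernel really has the packages you expect, a quick check like the one below may help. This is only a sketch: the pins are copied from the conda commands in the traceback at the top of the thread and may not match what the "Data Science 2.0" kernel ships, `installed_version` and `report` are hypothetical helper names, and `importlib.metadata` needs Python 3.8 or newer. Note that the conda package `pytorch` is normally registered under the distribution name `torch`.

```python
from importlib import metadata

# Pins copied from the install commands in the traceback; adjust as needed.
EXPECTED = {"torch": "1.6.0", "transformers": "3.5.1"}


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def report(expected=EXPECTED):
    """Return one status line per package, comparing wanted and found versions."""
    lines = []
    for package, wanted in expected.items():
        found = installed_version(package)
        status = "ok" if found == wanted else "MISMATCH"
        lines.append(f"{package}: wanted {wanted}, found {found} [{status}]")
    return lines
```

Running `print("\n".join(report()))` in a notebook cell shows at a glance whether the kernel already has the versions the script tries to install.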