C2W3 pipeline execution failure: environmental conflict

I have run into this issue a few times. From the logs, it seems to be caused by environment conflicts. It happens in multiple steps (e.g. Processing, EvaluateModel).

Can the lab designers double-check the environment setup? It’s very frustrating!

I downloaded the logs:

log-events-viewer-result (Processing).csv (69.4 KB)
log-events-viewer-result (EvaluateModel).csv (21.5 KB)

I don't understand. Could you please explain this to me?

If you read the logs, there are many entries about resolving environment conflicts, meaning the package versions are not compatible with each other.

Experiencing the same issue. It seems there is an issue with the pandas version compatibility with some of the PyTorch 1.6 dependencies.
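To check which versions actually end up installed, something like the sketch below could be run inside the step's script. It is not from the lab materials: `installed_versions` is a hypothetical helper, and it assumes the container runs Python 3.8 or newer, where `importlib.metadata` is part of the standard library.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in this environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Package names taken from this thread; adjust the list as needed.
    for name, version in installed_versions(["pandas", "torch", "transformers"]).items():
        print(f"{name}: {version or 'not installed'}")
```

The printed versions can then be compared against the pins in the scripts to see which packages are involved in the conflict.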

Hello @seanjiang, I am experiencing the same error observed in the attached CloudWatch logs. From the SageMaker Console, this is what I am seeing:

"Failure reason
AlgorithmError: See job logs for more information"

And from the CloudWatch log, it is exactly the same error as seen above. It seems there is an issue with the versions of the packages being used. I tried a different version of PyTorch (1.8), but it still gave me the same issue. Please advise, as this is the last module I need to complete the course.

Running into the same issue.

@Mubsi and @Okparaji_Stanley, can this issue, which we have raised for PDS2 and PDS3, be resolved? It has become a blocker for us finishing the course. Can we have some feedback? I want to complete this ASAP before being billed again. Thanks.

@seanjiang @Arijit_Sean_Gupta I found a solution to this error. Comment out the conda package installations shown below in prepare_data.py and evaluation_model_metrics.py, and use pip instead.

# subprocess.check_call([sys.executable, "-m", "conda", "install", "-c", "pytorch", "pytorch==1.6.0", "-y"])
subprocess.check_call([sys.executable, "-m", "pip", "install", "torch==1.6.0"])
# subprocess.check_call([sys.executable, "-m", "conda", "install", "-c", "conda-forge", "transformers==3.5.1", "-y"])
subprocess.check_call([sys.executable, "-m", "pip", "install", "transformers==3.5.1"])
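For anyone applying the same change by hand, here is a minimal, self-contained sketch of the pip-based install with the same pins. The helper names `pip_install_command` and `pip_install` are hypothetical and are not part of the lab scripts; the versions are the ones quoted above.

```python
import subprocess
import sys


def pip_install_command(package, version):
    """Build the argument list for installing one pinned package with pip."""
    return [sys.executable, "-m", "pip", "install", f"{package}=={version}"]


def pip_install(package, version):
    """Run the pinned pip install; raises CalledProcessError if pip fails."""
    subprocess.check_call(pip_install_command(package, version))


if __name__ == "__main__":
    # Only print the commands here; call pip_install(...) to actually install.
    for name, version in [("torch", "1.6.0"), ("transformers", "3.5.1")]:
        print(" ".join(pip_install_command(name, version)))
```

In the scripts themselves, `pip_install("torch", "1.6.0")` and `pip_install("transformers", "3.5.1")` would presumably need to run before `torch` and `transformers` are imported.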

Thank you, everybody. The prepare_data.py and evaluation_model_metrics.py files have now been updated, so the problem should be fixed.
