PTuT
July 22, 2021, 10:11pm
1
Running cell 7.4, I am getting ‘Failed’ when I run the pipeline. Everything in previous cells ran without an issue. The output of running this cell:
Executing
Please wait…
Please wait…
Please wait…
Please wait…
Please wait…
Please wait…
Please wait…
Please wait…
[{‘PipelineExecutionArn’: ‘arn:aws:sagemaker:us-east-1:930856572204:pipeline/bert-pipeline-1626986539/execution/b41ikr5p0jvl’,
‘PipelineExecutionDisplayName’: ‘execution-1626990693520’,
‘PipelineExecutionStatus’: ‘Failed’,
‘StartTime’: datetime.datetime(2021, 7, 22, 21, 51, 33, 354000, tzinfo=tzlocal())},
{‘PipelineExecutionArn’: ‘arn:aws:sagemaker:us-east-1:930856572204:pipeline/bert-pipeline-1626986539/execution/8hbk56myyf3z’,
‘PipelineExecutionDisplayName’: ‘execution-1626988115519’,
‘PipelineExecutionStatus’: ‘Failed’,
‘StartTime’: datetime.datetime(2021, 7, 22, 21, 8, 35, 324000, tzinfo=tzlocal())}]
CPU times: user 2.51 s, sys: 126 ms, total: 2.64 s
Wall time: 7min 22s
Exercise 6:
Can someone help with this?
Thanks.
PTuT
July 22, 2021, 11:09pm
2
Update: on the third attempt, it went through:
Please wait…
[{‘PipelineExecutionArn’: ‘arn:aws:sagemaker:us-east-1:930856572204:pipeline/bert-pipeline-1626986539/execution/gn27d9dditbx’,
‘PipelineExecutionDisplayName’: ‘execution-1626992760198’,
‘PipelineExecutionStatus’: ‘Succeeded’,
‘StartTime’: datetime.datetime(2021, 7, 22, 22, 26, 0, 63000, tzinfo=tzlocal())},
{‘PipelineExecutionArn’: ‘arn:aws:sagemaker:us-east-1:930856572204:pipeline/bert-pipeline-1626986539/execution/b41ikr5p0jvl’,
‘PipelineExecutionDisplayName’: ‘execution-1626990693520’,
‘PipelineExecutionStatus’: ‘Failed’,
‘StartTime’: datetime.datetime(2021, 7, 22, 21, 51, 33, 354000, tzinfo=tzlocal())},
{‘PipelineExecutionArn’: ‘arn:aws:sagemaker:us-east-1:930856572204:pipeline/bert-pipeline-1626986539/execution/8hbk56myyf3z’,
‘PipelineExecutionDisplayName’: ‘execution-1626988115519’,
‘PipelineExecutionStatus’: ‘Failed’,
‘StartTime’: datetime.datetime(2021, 7, 22, 21, 8, 35, 324000, tzinfo=tzlocal())}]
CPU times: user 10.1 s, sys: 412 ms, total: 10.5 s
Wall time: 32min 31s
Nothing was changed; it just went through. Very frustrating.
Everything finished fine in the end, so if someone else hits this issue: sometimes simply rerunning the pipeline a few times may work.
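For anyone who would rather script the rerun than click through it by hand, here is a minimal sketch, not the course notebook's code. It assumes a boto3 SageMaker client and uses its `start_pipeline_execution` and `describe_pipeline_execution` calls; the helper name `run_with_retries` and its parameters are made up for illustration.

```python
import time

# Statuses after which an execution will not change any more.
TERMINAL_STATUSES = {"Succeeded", "Failed", "Stopped"}


def run_with_retries(sm_client, pipeline_name, max_attempts=3, poll_seconds=30):
    """Start the pipeline and wait for it; start it again if it did not succeed.

    Hypothetical helper. `sm_client` is assumed to be boto3.client("sagemaker").
    Returns (status, execution_arn) of the last attempt.
    """
    status, arn = None, None
    for attempt in range(1, max_attempts + 1):
        arn = sm_client.start_pipeline_execution(
            PipelineName=pipeline_name
        )["PipelineExecutionArn"]
        # Poll until this execution reaches a terminal status.
        while True:
            status = sm_client.describe_pipeline_execution(
                PipelineExecutionArn=arn
            )["PipelineExecutionStatus"]
            if status in TERMINAL_STATUSES:
                break
            time.sleep(poll_seconds)
        print(f"attempt {attempt}: {status}")
        if status == "Succeeded":
            break
    return status, arn


# Usage (needs AWS credentials, so it is left as a comment):
# import boto3
# run_with_retries(boto3.client("sagemaker"), "bert-pipeline-1626986539")
```

The attempt limit is deliberate: each attempt runs the whole pipeline, so retrying without a cap could get expensive, and a failure that keeps repeating is better investigated in the logs.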
bj.kim
July 23, 2021, 2:29am
3
Hello @PTuT ,
I don’t think there is anything wrong with your code. Can you please look at the log for your pipeline and check whether it shows any issue?
You can check the log via SageMaker Studio and CloudWatch. Please see the image below if you are not sure how to check the log in SageMaker Studio.
For testing purposes, I slightly changed my code to trigger an error in prepared_data.py.
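Besides Studio and CloudWatch, the failure reason of each step can also be read from code. This is only a sketch under assumptions: it relies on the boto3 SageMaker client's `list_pipeline_execution_steps` call, and the helper name `failed_steps` is made up for illustration.

```python
def failed_steps(sm_client, execution_arn):
    """Return (step name, failure reason) for every failed step of one execution.

    Hypothetical helper. `sm_client` is assumed to be boto3.client("sagemaker").
    """
    failures = []
    kwargs = {"PipelineExecutionArn": execution_arn}
    while True:
        page = sm_client.list_pipeline_execution_steps(**kwargs)
        for step in page["PipelineExecutionSteps"]:
            if step.get("StepStatus") == "Failed":
                # FailureReason is only present when the step reported one.
                failures.append((step["StepName"], step.get("FailureReason", "")))
        token = page.get("NextToken")
        if not token:
            return failures
        kwargs["NextToken"] = token  # fetch the next page of steps


# Usage (needs AWS credentials, so it is left as a comment):
# import boto3
# arn = ("arn:aws:sagemaker:us-east-1:930856572204:pipeline/"
#        "bert-pipeline-1626986539/execution/b41ikr5p0jvl")
# for name, reason in failed_steps(boto3.client("sagemaker"), arn):
#     print(name, "->", reason)
```

The reason text is usually short; the full traceback of a failing script would still be in the CloudWatch logs of the underlying job.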
Best regards,
PTuT
July 23, 2021, 10:11am
4
Hi Kim,
Thanks for your reply. I ran it three times, and the third time was the charm: it went all the way through without an issue … I do not know what kind of ‘night bug’ the pipeline experienced, but all is good now and the exercise is finished.
bj.kim
July 24, 2021, 11:57pm
5
Hi Alex,
Sorry about the bad experience. However, I’m happy it is working for you now.
As engineers, we learn from unknown errors, even though it doesn’t feel good.
Happy learning