C4W4 - Capstone Project Part 1: VS Code error

I have tried to complete the assignment 7-8 times. Every time, I get an error while deploying the files for task 4.2 by running “terraform apply”. The jobs in section 4.1 were successful. It took me about 10 minutes to complete the earlier files, not 3 hours. This time I updated all the files and then deployed them, and I still got the error.

What is the error? Is there a known root cause, or a way I can track it down? Could you please advise me on this error? Thanks

VSCodeEndpoint: deployment error from “terraform apply” commands
The terminal process “/bin/bash” terminated with exit code: 1.

No error in the jupyter_output.log:

[I 2024-11-25 21:53:49.691 ServerApp] Skipped non-installed server(s): bash-language-server, dockerfile-language-server-nodejs, javascript-typescript-langserver, jedi-language-server, julia-language-server, pyright, python-language-server, python-lsp-server, r-languageserver, sql-language-server, texlab, typescript-language-server, unified-language-server, vscode-css-languageserver-bin, vscode-html-languageserver-bin, vscode-json-languageserver-bin, yaml-language-server
[W 2024-11-25 21:57:12.661 LabApp] Could not determine jupyterlab build status without nodejs
[I 2024-11-25 21:57:17.391 ServerApp] Writing notebook-signing key to /config/.local/share/jupyter/notebook_secret
[W 2024-11-25 21:57:17.392 ServerApp] Notebook C4_W4_Assignment_1.ipynb is not trusted
[I 2024-11-25 21:57:18.176 ServerApp] Kernel started: 8e9b6aee-a621-4a59-b48f-cexxxxxxxx
[I 2024-11-25 21:57:18.836 ServerApp] Connecting to kernel 8e9b6aee-a621-4a59-b48f-cexxxxxxxx.
[I 2024-11-25 21:57:18.879 ServerApp] Connecting to kernel 8e9b6aee-a621-4a59-b48f-cexxxxxxxxx.
[I 2024-11-25 21:57:18.915 ServerApp] Connecting to kernel 8e9b6aee-a621-4a59-b48f-cexxxxxxxx.
[I 2024-11-25 21:59:17.544 ServerApp] Saving file at /C4_W4_Assignment_1.ipynb
[I 2024-11-25 22:01:17.626 ServerApp] Saving file at /C4_W4_Assignment_1.ipynb
[I 2024-11-25 22:01:44.002 ServerApp] Saving file at /terraform/assets/extract_jobs/de-c4w4a1-api-extract-job.py
[I 2024-11-25 22:03:30.298 ServerApp] Saving file at /terraform/assets/extract_jobs/de-c4w4a1-extract-songs-job.py
[I 2024-11-25 22:03:34.381 ServerApp] Saving file at /terraform/assets/extract_jobs/de-c4w4a1-api-extract-job.py
[I 2024-11-25 22:05:03.448 ServerApp] Saving file at /terraform/assets/transform_jobs/de-c4w4a1-transform-json-job.py
[I 2024-11-25 22:06:03.987 ServerApp] Saving file at /terraform/assets/transform_jobs/de-c4w4a1-transform-songs-job.py
[I 2024-11-25 22:07:42.923 ServerApp] Saving file at /terraform/main.tf
[I 2024-11-25 22:08:18.956 ServerApp] Saving file at /terraform/outputs.tf
[I 2024-11-25 22:09:31.282 ServerApp] Saving file at /terraform/modules/extract_job/s3.tf
[I 2024-11-25 22:11:33.835 ServerApp] Saving file at /terraform/modules/extract_job/glue.tf
[I 2024-11-25 22:12:45.054 ServerApp] Saving file at /terraform/modules/extract_job/glue.tf
[I 2024-11-25 22:14:44.318 ServerApp] Saving file at /terraform/modules/transform_job/glue.tf
[I 2024-11-25 22:14:48.461 ServerApp] Saving file at /terraform/modules/transform_job/glue.tf
[I 2024-11-25 22:15:37.055 ServerApp] Saving file at /terraform/modules/transform_job/s3.tf
[I 2024-11-25 22:15:41.056 ServerApp] Saving file at /terraform/modules/transform_job/glue.tf
[I 2024-11-25 22:16:57.802 ServerApp] Saving file at /terraform/modules/serving/iam.tf
[I 2024-11-25 22:17:47.365 ServerApp] Saving file at /terraform/modules/serving/redshift.tf
[I 2024-11-25 22:27:01.998 ServerApp] Saving file at /C4_W4_Assignment_1.ipynb
[I 2024-11-25 22:27:02.857 ServerApp] Saving file at /C4_W4_Assignment_1.ipynb
[I 2024-11-25 22:27:04.909 ServerApp] Saving file at /C4_W4_Assignment_1.ipynb
[I 2024-11-25 22:27:05.155 ServerApp] Saving file at /C4_W4_Assignment_1.ipynb
[I 2024-11-25 22:27:05.396 ServerApp] Saving file at /C4_W4_Assignment_1.ipynb

I rebooted VS Code from CloudShell, following the steps in the course:
[cloudshell-user@ip-10-xxxxxx ~] nano -c restart_vscode_service.sh
[cloudshell-user@ip-10-xxxxxx ~] bash restart_vscode_service.sh
VSCode server rebooted, try to connect again

log:
[C 2024-11-25 22:30:02.023 ServerApp] received signal 15, stopping
[I 2024-11-25 22:30:02.027 ServerApp] Shutting down 4 extensions
[I 2024-11-25 22:30:02.028 ServerApp] Shutting down 1 kernel
[I 2024-11-25 22:30:02.028 ServerApp] Kernel shutdown: 8e9b6aee-a621-4a59-b48f-xxxxxxxxxxx

Then I ran “source scripts/setup.sh” in VS Code, without even deploying the files.
I still got the same error.

More info:

  1. My internet connection is fast and stable.
  2. This issue started last Saturday, and I also experienced it last Sunday.
  3. I tried clearing my web browser data and rebooting my laptop. No luck.
  4. I tried rebooting the VSCode server and then ran "source scripts/setup.sh". No luck.

Hello @Adazhu
Sorry for the inconvenience. What do you mean when you say you updated all the files then deployed them?

Hi Amir,
Thanks for your help.

In part 4, there are 3 tasks. I have no issue completing task 4.1, and its jobs were successful. However, each time I completed the files for task 4.2 and ran the terraform commands, I got this error in the terminal of the VSCodeEndpoint.

The last time, I updated the files for all 3 tasks in part 4 and then ran the terraform commands, and I got this error again.

I tried to work around it by rebooting VS Code from CloudShell, but no luck. I don’t understand the error, because I am not running any bash file except the reboot script provided by the course.

@Adazhu thank you for providing the details.
First of all, please do the parts one by one as is mentioned in the instructions.
Secondly, please take a screen recording the next time you run terraform apply for the transform tasks, so that we can see what exception is raised and what is causing the terminal to crash. I think the problem may be with the Glue databases. If the error says some database already exists, you can do the following: go to AWS Glue, select Databases from the menu on the left, select the database, and press Delete.


Please, note that I took this screenshot from another lab. So, in your case, the database you may need to delete is probably different.
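
For anyone who prefers the terminal, the same check and cleanup can be sketched with the AWS CLI. This is only a sketch: it assumes the AWS CLI in the lab terminal has permission to list and delete Glue databases, and the helper names `list_glue_databases` and `delete_glue_database` are made up for illustration. The database to delete is whichever one the terraform error reports as already existing.

```shell
# Hypothetical helpers; only the "aws glue" subcommands are real AWS CLI.

# Print the names of all Glue databases in the current account and region.
list_glue_databases() {
  aws glue get-databases --query "DatabaseList[].Name" --output text
}

# Delete one Glue database by name. The tables inside it go away with it,
# so only use this on a database that terraform will recreate.
delete_glue_database() {
  if [ -z "$1" ]; then
    echo "usage: delete_glue_database <database-name>" >&2
    return 2
  fi
  aws glue delete-database --name "$1"
}
```

Run `list_glue_databases` first, and delete a database only if its name matches the one in the error message.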

Hi Amir,

It isn’t the same issue as in your screenshot. I checked Glue before and after running task 4.1, and no table had been generated in the Glue catalog databases.

As usual, my jobs in 4.1 were successful. The error may be caused by “cd terraform”, since I was already in the terraform directory from task 4.1. When I ran “cd terraform” at step 4.2.6, I got the same exit code: 1.

I tried the workaround below.

  1. I rebooted VS Code from CloudShell, reloaded the VS Code endpoint, and ran "source scripts/setup.sh".
  2. I double-checked my updated files for tasks 4.1 and 4.2, and they looked good to me.
  3. I ran the 4.1 jobs again and they were successful.
  4. I deployed the 4.2 files by running the terraform commands, but my jobs in 4.2 failed.
  5. I got another error code: 100.
  6. I checked Glue and no table had been generated.

I continued working on 4.3 for the Redshift setup, and I got results returned from 5.1 and 5.2. The de_c4w4a1_silver_db database was created, but it was empty, with no tables. I believe this is because my transform jobs failed.

I have repeated the above process a few times. Sometimes I also got the error “the user doesn’t have permission to run glue job” after I rebooted the VS Code console.

I need help with:

  1. How can I find out why my 4.2 jobs failed? I don’t see any details in the log.
  2. After I rebooted the VS Code console a few times, the VS Code endpoint would no longer load. I tried clearing all my browser data and rebooting my laptop, but no luck. Then I had to wait for the AWS session to disconnect (3 hours). Could you advise me on how to make reconnecting to VS Code less disruptive?

Thanks! :slight_smile:

Hello again @Adazhu
You don’t need to run cd terraform in step 4.2.6 if you are using the same terminal from before. It may already be in the terraform folder.
When you say your jobs in 4.2 fail, do you mean that you are able to run terraform apply in step 4.2.6 successfully, but the jobs from step 4.2.7 fail? Can you provide a screenshot of the error you get and tell us exactly in which step it happens?
About the VS reboot, I suggest you don’t try rebooting if unnecessary. Even if the terminal crashes, you can open a new terminal, and you don’t need to reboot VSCode.
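
One way to make step 4.2.6 safe to repeat is to change directory only when needed. The sketch below assumes the project folder is named `terraform`, as in the lab. The helper name `enter_terraform_dir` is hypothetical and not part of the course scripts.

```shell
# Hypothetical helper: change into the terraform folder only when the shell
# is not already inside it, so a repeated "cd terraform" cannot fail.
enter_terraform_dir() {
  if [ "$(basename "$PWD")" = "terraform" ]; then
    echo "already in $PWD"
  elif [ -d terraform ]; then
    cd terraform && echo "moved to $PWD"
  else
    echo "no terraform folder under $PWD" >&2
    return 1
  fi
}
```

Calling `enter_terraform_dir` before `terraform apply` gives the same result whether the terminal is new or reused from step 4.1.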

Hi Amir,

I tried to work on my assignment today and got the error half an hour after I logged in.


Then I tried to start the lab and got an error: "Your total lab usage time of 1800 mins has exceeded the total allocated time of 1800 mins". It can’t be restarted. Could you please advise me? Sorry about the new issue. :frowning:

Sorry for the inconvenience @Adazhu
Please, fill out the lab refresh form, and our team will reset your lab budget so that you can try again.

I have submitted the form. Thank you.

This time, terraform apply ran successfully. Then I ran my jobs for 4.2, and they both failed. I double-checked my files for 4.2 and they looked good to me. However, I couldn’t get any hints from the log, since it shows no error.

I would like to know how to check why these two jobs failed.

Thank you for your timely support. Happy Thanksgiving to you and your family.

Hi Amir,

Could you please advise me on the error below? I had the same issue while running the 4.2 jobs in the Part 1 assignment, so I am reporting it here. Thanks.

I worked on Part 2. However, I got the error “An error occurred (AccessDeniedException) when calling the StartJobRun operation: User: arn:aws:sts::160214670062:assumed-role/VSCodeInstanceRole/i-049804cd3eb2864b7 is not authorized to perform: glue:StartJobRun on resource: arn:aws:glue:us-east-1:160214670062:job/glue_api_users_extract_job because no identity-based policy allows the glue:StartJobRun action”.

I tried to run the jobs in Glue but no luck.

Hello again @Adazhu
Thanks, and happy thanksgiving to you and your loved ones too.
About the error you get in the command line, it is because you put glue_api_users_extract_job as the <JOB_NAME>. You should copy the output named glue_api_users_extract_job from terraform outputs and replace that in the command.
However, if you had run out of lab budget already, this wouldn’t have worked either. Since the lab refreshes are done manually and currently we’re having the US thanksgiving holidays, please wait until 2 working days after that and try the lab again when the refresh is done. I hope the issues are fixed by then.
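
To avoid pasting the output's label in place of the real job name, the value can be read straight from terraform. This is a sketch only, assuming it is run from the terraform folder after a successful `terraform apply`. The helper name `start_job_from_output` is hypothetical.

```shell
# Hypothetical helper: start a Glue job whose real name is stored in a
# terraform output, instead of typing the output's label as the job name.
start_job_from_output() {
  output_name="$1"
  # "terraform output -raw" prints the value without surrounding quotes.
  job_name="$(terraform output -raw "$output_name")" || return 1
  # Print only the run id, which is needed later for "aws glue get-job-run".
  aws glue start-job-run --job-name "$job_name" --query "JobRunId" --output text
}
```

For example, `start_job_from_output glue_api_users_extract_job` looks up the value behind that output and passes it to `--job-name`.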

Hey Amir,
For the Part 1 assignment, I am still waiting for my lab access to be restored.

Thank you for finding the problem in my Part 2 assignment. I have completed it with a score of 100. However, my tasks failed in Airflow; here are the details. Could you please advise me on how to troubleshoot the issue in glue.py? Thank you so much.

VS console: all jobs were successful.

(jupyterlab-venv) abc@b2540389763a:~/workspace/terraform$ aws glue get-job-run --job-name de-c4w4a2-api-users-extract-job --run-id jr_1cef28a5f932ec4c40904fab261dd2b8921572619df68e5db89de0d306c43dbe --output text --query "JobRun.JobRunState"

SUCCEEDED

(jupyterlab-venv) abc@b2540389763a:~/workspace/terraform$ aws glue get-job-run --job-name de-c4w4a2-api-sessions-extract-job --run-id jr_5099e847e7ca5af6996c5bf57a57cad67e6faeb506e03f9464e52f5d3dce8cac --output text --query "JobRun.JobRunState"

SUCCEEDED

(jupyterlab-venv) abc@b2540389763a:~/workspace/terraform$ aws glue get-job-run --job-name de-c4w4a2-rds-extract-job --run-id jr_1973d7e228ecca45b3e3192bda6b62e03f38c21620d4c0a5c246e6a8a333032d --output text --query "JobRun.JobRunState"

SUCCEEDED

(jupyterlab-venv) abc@b2540389763a:~/workspace/terraform$ aws glue get-job-run --job-name de-c4w4a2-songs-transform-job --run-id jr_e691791600d704f4de6ce1a932cc69ee78ea552af197ef3c2b96c19f79f60893 --output text --query "JobRun.JobRunState"

SUCCEEDED

(jupyterlab-venv) abc@b2540389763a:~/workspace/terraform$ aws glue get-job-run --job-name de-c4w4a2-json-transform-job --run-id jr_0cc29bb43014b7a00eed692268d7995feab722ebc532292453503d55c726a203 --output text --query "JobRun.JobRunState"

SUCCEEDED


File “/home/airflow/.local/lib/python3.11/site-packages/airflow/providers/amazon/aws/hooks/glue.py”, line 346, in _handle_state

raise AirflowException(job_error_message)

airflow.exceptions.AirflowException: Exiting Job jr_91fd58a6faedd4b167b33dad1f2cb30fbb8eb3c50af689769ed9f81765cbcef1 Run State: FAILED

[2024-12-02, 21:01:45 UTC] {local_task_job_runner.py:266} INFO - Task exited with return code 1

[2024-12-02, 21:01:45 UTC] {taskinstance.py:3900} INFO - 0 downstream tasks scheduled from follow-on schedule check

[2024-12-02, 21:01:45 UTC] {local_task_job_runner.py:245} ▲▲▲ Log group end

I followed your steps to check the data quality in AWS Glue. I don’t have any errors in the three tables. Here is one of the checks, for the song table.

In the last step of Part 2, the data visualization in Superset, I don’t know the details of the required data in the screenshot. Does that mean my connection to Redshift failed?

Hi Amir,

I am now able to access the Part 1 assignment. My jobs in task 4.1 were successful, but my jobs in task 4.2 still failed. I reviewed all the files for task 4.2 and they looked good to me. Could you please advise me?

VS Console:
(jupyterlab-venv) abc@5117febb3ccf:~/workspace/terraform$ aws glue get-job-run --job-name de-c4w4a1-json-transform-job --run-id jr_f48023fd46346c67385e4ad3f0248672154285b83e9d7a2fbe609dfcdba5251b --output text --query "JobRun.JobRunState"

FAILED

(jupyterlab-venv) abc@5117febb3ccf:~/workspace/terraform$ aws glue get-job-run --job-name de-c4w4a1-songs-transform-job --run-id jr_e5893c534def718e7b07cd07102fda991840d3721a1f6c1bade3c3c4ed129747 --output text --query "JobRun.JobRunState"

FAILED

I tried to run the two jobs in Glue but no luck.

No error:

Hello @Adazhu
Since the jobs are failing, there is probably something wrong with either the terraform files (s3.tf and glue.tf) or the asset files (de-c4w4a1-transform-json-job.py and de-c4w4a1-transform-songs-job.py). Please click on the failed job in the AWS Glue console, go to the Runs menu, and select the failed run. You can see the logs there and add them here.
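
The failure reason can usually also be read from the terminal with the same `aws glue get-job-run` command used to check the state. A minimal sketch, assuming the job name and run id are the ones from the failed run. The helper name `show_job_run_error` is hypothetical, and the `ErrorMessage` field may be empty for some failures, in which case the console logs remain the place to look.

```shell
# Hypothetical helper: print the state and error message of one Glue job run.
# Arguments: <job-name> <run-id>
show_job_run_error() {
  aws glue get-job-run \
    --job-name "$1" \
    --run-id "$2" \
    --query "JobRun.[JobRunState, ErrorMessage]" \
    --output text
}
```

For example, `show_job_run_error de-c4w4a1-json-transform-job <run-id>` prints the run state followed by whatever error message Glue recorded for that run.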
About the issues with other labs, please post them in separate threads, as it’s becoming a little hard to follow this one. We will assist you there. Thank you.