Unfortunately, I have run out of time and ideas on how to fix it. Everything seems to have been done correctly up to 4.2 - DAG for Songs Data in RDS Source, and your prompt assistance is very much needed.
Hello @MiloAndelic
I was not able to reproduce the issue, and my DAGs ran successfully. Can you please tell us which of the tasks in your DAG fails and provide the logs?
You can select each of the tasks from the chart like this:
Hello @MiloAndelic
Thank you for the details; I needed to know which task in your DAG is failing. The rds_extract_glue_job task uses the parameter SCRIPTS_BUCKET_NAME, which you are required to fill in during step 4.2.2. These parameters should be copied from the terraform outputs. Did you do this correctly?
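For reference, here is a minimal sketch of what a filled-in task can look like. The job name, bucket name, and DAG settings below are only illustrative assumptions, not the lab's exact code; the key point is that SCRIPTS_BUCKET_NAME must hold the literal bucket name from the scripts_bucket terraform output, passed as a plain string.

```python
import pendulum
from airflow import DAG
from airflow.providers.amazon.aws.operators.glue import GlueJobOperator

# Paste the value of the "scripts_bucket" terraform output here (example value shown).
SCRIPTS_BUCKET_NAME = "de-c4w4a1-111122223333-us-east-1-scripts"

with DAG(
    dag_id="deftunes_song_pipeline_example",
    start_date=pendulum.datetime(2024, 1, 1),
    schedule=None,
    catchup=False,
) as dag:
    rds_extract_glue_job = GlueJobOperator(
        task_id="rds_extract_glue_job",
        job_name="de-c4w4a1-rds-extract-job",  # illustrative Glue job name
        script_location=f"s3://{SCRIPTS_BUCKET_NAME}/glue_rds_extract_job.py",
        s3_bucket=SCRIPTS_BUCKET_NAME,
        wait_for_completion=True,
    )
```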
As for your problem with the Jupyter notebook, the address you were provided should contain a port number at the end. You might have missed it while copying the address.
Thanks @Amir_Zare. Please note that the address provided for the Jupyter notebook did include a port number at the end.
As suggested, I just tried to collect the requested details but wasn’t able to open the lab at all due to “Your total lab spend of $20.135681 has exceeded the total budget of $20”. Can you please sort this out quickly, as I need to complete the lab soon?
Our lead lab engineer is notified when a learner submits the form. She’ll work through the backlog of requests as soon as possible. Thank you for your patience.
Thanks @Jessica-DLAI for helping resolve the issue! Everything ran smoothly without any Airflow DAG hiccups this time. Your and @Amir_Zare's support is appreciated - have a fantastic day!
Should deftunes_song_pipeline.py contain the values from the outputs.tf file in the terraform folder? Which values exactly? Should they be passed as strings or as variables?
I can find the outputs “scripts_bucket” and “data_lake_bucket” but am still not sure. I passed the values of those outputs as strings and am still having problems.
Hello @hcara
Sorry for the inconvenience. When you run terraform apply, the terraform outputs are shown in the command line. Two of these outputs are data_lake_bucket and scripts_bucket. You are supposed to copy these values from the terminal and paste them into the python scripts, replacing the placeholders.
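If the outputs have already scrolled off your terminal, you can print them again by running terraform output inside the terraform folder. As a rough sketch, the replacement in the python script looks like this (the placeholder text and bucket names below are examples, not the exact lab values):

```python
# Before: the starter script ships with placeholder strings, e.g.
#   SCRIPTS_BUCKET_NAME = "<SCRIPTS_BUCKET_NAME>"
#   DATA_LAKE_BUCKET = "<DATA_LAKE_BUCKET>"

# After: paste the terraform output values verbatim, as plain strings.
SCRIPTS_BUCKET_NAME = "de-c4w4a1-111122223333-us-east-1-scripts"
DATA_LAKE_BUCKET = "de-c4w4a1-111122223333-us-east-1-data-lake"
```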