Can't see any DAGs after running generate_dags.py

I am running generate_dags in the terminal and get an empty DAG folder…
Help :confused:

Hello @Chenko,

It seems you ran the command that creates the dag_configs folder from inside the src/templates folder. In step 5.2.1 your terminal should be at the project folder instead, so that the command creates the correct path:
mkdir -p src/templates/dag_configs

This should create the correct path. Then try copying the 3 config files into that folder. Thank you.

Thanks for the response, it worked!
Now I get an error in the code when running the DAG…

Here’s my code. What am I missing?

Hello @Chenko,
As you can see, in the is_deployable task you are taking the performance value from the train_and_evaluate task. Could you check that you return the performance in line 109 of template.py? Hope it helps.

Thanks for the reply…


Seems fine to me :confused:

Hello @Chenko,

I had a similar issue before when there were non-integer values in the train set. It seems there is a bug in your code in the train_and_evaluate task:

        You used double curly braces {{ vendor_name }} instead of single ones:
        train = pd.read_parquet(f"{datasets_path}/{{ vendor_name }}/train.parquet")
        test = pd.read_parquet(f"{datasets_path}/{{ vendor_name }}/test.parquet")

        Use {vendor_name} instead:
        train = pd.read_parquet(f"{datasets_path}/{vendor_name}/train.parquet")
        test = pd.read_parquet(f"{datasets_path}/{vendor_name}/test.parquet")
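
To illustrate the difference, here is a minimal sketch of how a plain Python f-string treats the two forms. The values of datasets_path and vendor_name are hypothetical placeholders, and this assumes the line is evaluated directly as Python; if a templating step renders the file first, the braces may be handled differently.

```python
# Hypothetical values, used only to show how the braces are interpreted.
datasets_path = "s3://example-bucket/datasets"
vendor_name = "vendor_a"

# Doubled braces are an escape in f-strings: they produce literal braces
# and the variable is NOT substituted.
doubled = f"{datasets_path}/{{ vendor_name }}/train.parquet"

# Single braces substitute the value of the variable.
single = f"{datasets_path}/{vendor_name}/train.parquet"

print(doubled)  # s3://example-bucket/datasets/{ vendor_name }/train.parquet
print(single)   # s3://example-bucket/datasets/vendor_a/train.parquet
```

Printing the path right before the read_parquet call is a quick way to see which of the two you actually end up with.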

Hi, still not working…
Is everything defined correctly in the pictures I’ve added?
I tried all kinds of different things and nothing worked (it ran only once, when I hardcoded 409 in place of the performance variable to check whether it works).

Changed -


Thanks

Hello @Chenko,

You’ve changed lines 85-86 and they look correct. Your dependencies do as well. Could you check that you didn’t make the same mistake in line 51 with {vendor_name}:

            f"s3://{Variable.get('bucket_name')}/work_zone/data_science_project/datasets/"
            f"{vendor_name}/train.parquet" <--- Did you use {{vendor_name}} here as well?
        ),

Unfortunately, I am waiting for a lab refresh since I made too many attempts. You could send me your template.py so I can check it. Thank you.

Also the same here


Also, the data_quality_task has finished successfully in Airflow.

Hello @Chenko,

Yes, the code is complete and looks identical to mine. I would check that you used the raw_data bucket and not the dags bucket when you defined the Variable in the Airflow UI and when you copied the parquet files in step 3.2. Finally, manually check that the template I saw is correctly updating the three files in the dags folder. Hope it helps. After fixing the files, are you still getting the same error:

TypeError: '<' not supported between instances of 'NoneType' and 'int'

@Chenko Found it in line 122, you used:

performance = ti.xcom_pull(task_ids="train_and_evalute")

instead of:

performance = ti.xcom_pull(task_ids="train_and_evaluate")
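
For anyone hitting the same thing, here is a rough sketch of why a misspelled task id leads to the TypeError reported earlier in the thread. FakeTaskInstance is a hypothetical stand-in and not the real Airflow class; it only assumes that pulling from a task id with no stored value gives back None. The stored value 412.5 and the threshold 500 are made-up numbers.

```python
class FakeTaskInstance:
    """Hypothetical stand-in for a task instance; NOT the real Airflow API."""

    def __init__(self, xcoms):
        self._xcoms = xcoms

    def xcom_pull(self, task_ids):
        # Assumption: an unknown task id yields None instead of raising.
        return self._xcoms.get(task_ids)


def is_deployable(ti, task_id, threshold=500):
    # threshold is a made-up value for illustration only.
    performance = ti.xcom_pull(task_ids=task_id)
    return performance < threshold


# Only the correctly spelled task has a stored return value.
ti = FakeTaskInstance({"train_and_evaluate": 412.5})

ok_result = is_deployable(ti, "train_and_evaluate")

try:
    is_deployable(ti, "train_and_evalute")  # misspelled on purpose
    error_message = None
except TypeError as err:
    error_message = str(err)

print(ok_result)
print(error_message)
```

Since the pulled value is None for the misspelled id, the comparison fails rather than the pull itself, which is why the traceback points at the comparison and not at the typo.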

Thanks