My JSON transform job keeps failing and shows the following error:
AnalysisException: Column ‘user_location’ does not exist. Did you mean one of the following? [user_agent, user_id, session_id, ingest_on, session_items, session_start_time];
I browsed the forum for a solution and found that the problem could be in the extract job. So far I have tried the following:
- First, I checked the JSON files in S3: the JSON file under landing_zone/api/users/2025_XX_XX/ contains only session info instead of user info, which means the extract job is wrong.
- Then I modified the glue.tf file under extract_job as mentioned in other forum topics, adding "/users" after "--api_url", without success. The extract job run failed with an error saying it did not recognise the path "api_url/users/sessions?…."
- I checked the file "de-c4w4a1-api-extract-job.py" and realised that request_api_url has only one format, f"{api_url}/sessions?start_date={request_start_date}&end_date={request_end_date}", which is not the request URL used in section 2.7 of C4_W4_Assignment_1.ipynb, i.e. users_request = requests.get(f'http://{API_ENDPOINT}/users'). I think this is the problem: session info is fetched into S3 instead of user info, so the transform job cannot find "user_location".
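
To illustrate what I mean, here is a minimal sketch of how the request URL could be built per endpoint, plus a small check for whether fetched records carry a given column. The names build_request_url and has_column, the endpoint argument, the example.com host, and the sample record are my own placeholders, not code from the extract script. I am also assuming the /users endpoint takes no date parameters, as in the notebook call.

```python
from typing import Optional


def build_request_url(api_url: str, endpoint: str,
                      start_date: Optional[str] = None,
                      end_date: Optional[str] = None) -> str:
    """Hypothetical helper: build the request URL for one API endpoint."""
    base = f"{api_url.rstrip('/')}/{endpoint}"
    if start_date and end_date:
        # Same query-string format as the existing sessions request.
        return f"{base}?start_date={start_date}&end_date={end_date}"
    # Assumption: endpoints such as "users" are requested without dates.
    return base


def has_column(records: list, column: str) -> bool:
    """Return True only if the list is non-empty and every record has the key."""
    return bool(records) and all(column in record for record in records)


# Made-up record that reuses column names from the error message.
sample = [{"user_id": 1, "session_id": "abc", "user_agent": "x"}]

print(build_request_url("http://example.com", "users"))
print(build_request_url("http://example.com", "sessions",
                        "2025-01-01", "2025-01-31"))
print(has_column(sample, "user_location"))
```

Running has_column on the records downloaded from landing_zone/api/users/ would confirm whether "user_location" is really missing before the transform job runs.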
Please advise how I can solve this problem.
Thank you.



