Why S3+Athena for analysts to query instead of a relational database?

marinafuster · June 22, 2025, 12:49pm

I was watching the video "Lab Walkthrough - Introduction to the Lab" from week 2, which walks through an introductory exercise with AWS Glue. The basic pipeline goes as follows:

ingesting from RDS
transforming the existing normalized schema into a star schema
loading into an S3 bucket (from which an analyst will query using Amazon Athena)

I am not sure why S3+Athena was chosen over loading into another relational database after the transformation. Is there a technical reason for this?
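
For anyone who has not seen the lab, the transform step above can be sketched roughly as follows. This is plain Python standing in for the Glue job, not the lab's actual code; the table names, columns, and the `build_star_schema` helper are all made up for illustration.

```python
# Hypothetical normalized rows, as they might come out of the source RDS database.
customers = [
    {"customer_id": 1, "name": "Ana", "country": "AR"},
    {"customer_id": 2, "name": "Ben", "country": "KE"},
]
orders = [
    {"order_id": 10, "customer_id": 1, "order_date": "2025-06-01"},
    {"order_id": 11, "customer_id": 2, "order_date": "2025-06-02"},
]
order_lines = [
    {"order_id": 10, "product": "widget", "quantity": 2, "unit_price": 5.0},
    {"order_id": 10, "product": "gadget", "quantity": 1, "unit_price": 12.5},
    {"order_id": 11, "product": "widget", "quantity": 4, "unit_price": 5.0},
]


def build_star_schema(customers, orders, order_lines):
    """Denormalize into one fact table plus customer and date dimensions."""
    dim_customer = {c["customer_id"]: c for c in customers}
    orders_by_id = {o["order_id"]: o for o in orders}
    dim_date = {}
    fact_sales = []
    for line in order_lines:
        order = orders_by_id[line["order_id"]]
        year, month, day = (int(part) for part in order["order_date"].split("-"))
        date_key = year * 10000 + month * 100 + day  # e.g. 20250601
        dim_date.setdefault(
            date_key,
            {"date_key": date_key, "year": year, "month": month, "day": day},
        )
        # Each fact row holds foreign keys to the dimensions plus the measures.
        fact_sales.append({
            "order_id": line["order_id"],
            "customer_id": order["customer_id"],
            "date_key": date_key,
            "product": line["product"],
            "quantity": line["quantity"],
            "amount": line["quantity"] * line["unit_price"],
        })
    return fact_sales, dim_customer, dim_date


fact_sales, dim_customer, dim_date = build_star_schema(customers, orders, order_lines)
```

In the lab the resulting tables are written to S3 rather than kept in memory, which is the step my question is about.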

rkthoya · June 22, 2025, 1:29pm

Hi Marina.

Good catch. I hadn’t thought about it myself and I’m unsure about the answer. But if I were to guess, I’d imagine it to be either a cost thing or that S3 + Athena essentially gives you a data lake architecture, which would be handy at some point if the structure of the data were to evolve. Only a guess though. Perhaps someone from the team might see this and give us a sure answer.

marinafuster · June 24, 2025, 12:41am

From what I’ve been reading, S3+Athena offers:

lower costs than RDS (Parquet files are stored in S3, and Athena is pay-per-query), especially for ad-hoc queries that investigate particular issues rather than ongoing reporting efforts
a more complex initial setup from the engineer's perspective compared to RDS, which only requires configuring the relational database (though this is not dramatic if you have a data engineer, and it pays off later in scalability)
better scalability than RDS when the data is at the TB or PB scale
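
To make the pay-per-query point concrete, here is a back-of-the-envelope sketch. The numbers are assumptions, not quotes: I am assuming Athena bills about $5 per TB scanned (check the current pricing page for your region), and the RDS hourly rate is an invented placeholder. S3 storage and RDS storage costs are left out.

```python
HOURS_PER_MONTH = 730  # average number of hours in a month


def athena_monthly_cost(queries_per_month, tb_scanned_per_query, price_per_tb=5.0):
    """Pay-per-query: cost follows the data scanned and is zero when nobody queries.

    price_per_tb is an assumed rate, not an authoritative one.
    """
    return queries_per_month * tb_scanned_per_query * price_per_tb


def rds_monthly_cost(hourly_rate, hours=HOURS_PER_MONTH):
    """Provisioned instance: billed for every hour it runs, queried or not."""
    return hourly_rate * hours


# Hypothetical workload: 100 ad-hoc queries a month, each scanning about 10 GB.
adhoc = athena_monthly_cost(queries_per_month=100, tb_scanned_per_query=0.01)
# Hypothetical instance price of $0.10 per hour, running all month.
always_on = rds_monthly_cost(hourly_rate=0.10)

print(f"Athena (ad-hoc): ${adhoc:.2f}/month")
print(f"RDS (always on): ${always_on:.2f}/month")
```

The comparison flips if analysts run heavy queries all day, since the Athena cost grows with every scan while the instance price stays flat. That seems to be why people recommend it for occasional, investigative queries.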

There is also some discussion about better performance on massive datasets, but honestly I am not sure about that. Apparently, this is a common strategy after ETL jobs.

If anyone from the course or from the audience has any extra insights, comments are very welcome.
