Choice of using Lambda function in the Streaming pipeline

damolavictor · January 5, 2025, 6:32pm

Hello Team,

I am about to start the final lab for week 4, and when the instructor shared the architecture diagram for the streaming pipeline, I noticed the use of Lambda functions for the model inference and stream transformation. I understand this is a lab, but I am concerned about whether this is the most cost-effective and efficient way to achieve the task, since Lambda functions are billed on the number of times they are invoked and the duration they run. We are also looking at around 10,000 concurrent users (which could be higher) as the organization expands its reach.

Could you please clarify this? Maybe my assumption is wrong. Please let me know why this approach suits our use case, and what cost-effective alternatives could be implemented to achieve the same purpose.

I will share the architecture diagram for reference.

Georgios · January 5, 2025, 7:04pm

Hello @damolavictor,
Yes, you are correct, and thinking about cost is part of an engineer's job. That is why Joe explains the advantages of serverless and containers in week 3 of the course. Later in the specialization you will see different architectures (mostly medallion) which are more cost-effective than this one. I guess Lambda functions are used here for simplicity, just to demonstrate how to implement this type of architecture. Hope it helps.

damolavictor · January 5, 2025, 7:59pm

Thank you for this. I will be looking forward to the upcoming sections to learn more about this.

ocie · January 29, 2025, 12:13am

I’m also confused about this because the quiz earlier in that lesson rejected the use of Lambda. Maybe I misunderstood which part of the overall solution the answer was describing.

Georgios · January 29, 2025, 11:22am

Hello @ocie,
I think I understand which part confuses you. First, the quiz refers to the Lambda architecture, in which the source system sends data to both a stream path and a batch path.
AWS Lambda, on the other hand, is a serverless service: it runs code in response to an event and executes small chunks of code on an as-needed basis. You pay a little each time your code runs.
However, it wouldn’t make sense to use it if you are handling one event per function invocation at a high event rate. As the OP suggested, that could be catastrophically expensive. Hope it makes sense.
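To make the per-invocation cost concern concrete, here is a rough back-of-the-envelope sketch in Python. Everything in it is an assumption for illustration: the two price constants are example figures (check the current AWS pricing page for your region), the rate of 10,000 events per second, the durations, the memory size, and the batch size are all made up, and `monthly_lambda_cost` is a hypothetical helper, not part of any AWS SDK. It ignores free tiers and any other services in the pipeline.

```python
import math

# Assumed example prices in USD; verify against current AWS pricing.
PRICE_PER_REQUEST = 0.20 / 1_000_000   # per invocation (assumption)
PRICE_PER_GB_SECOND = 0.0000166667     # per GB-second of compute (assumption)

SECONDS_PER_MONTH = 30 * 24 * 3600


def monthly_lambda_cost(events_per_second, batch_size, duration_ms, memory_mb):
    """Estimate a monthly bill from invocation count and run duration.

    batch_size is the number of events handled by one invocation;
    duration_ms is the assumed run time of one invocation.
    """
    total_events = events_per_second * SECONDS_PER_MONTH
    invocations = math.ceil(total_events / batch_size)
    gb_seconds = invocations * (duration_ms / 1000) * (memory_mb / 1024)
    return invocations * PRICE_PER_REQUEST + gb_seconds * PRICE_PER_GB_SECOND


# One event per invocation, 50 ms each (hypothetical numbers).
per_event = monthly_lambda_cost(10_000, batch_size=1, duration_ms=50, memory_mb=512)

# 100 events per invocation, assumed to take 500 ms per batch.
batched = monthly_lambda_cost(10_000, batch_size=100, duration_ms=500, memory_mb=512)

print(f"one event per invocation: ${per_event:,.2f} per month")
print(f"batches of 100 events:    ${batched:,.2f} per month")
```

Under these assumed numbers, batching events per invocation cuts both the request charge and the compute charge by a large factor, which is the point about not handling one event per function at a high event rate. The real figures depend entirely on your workload and on how the run time scales with batch size.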
