Confirming About Training Set and Test Set (Week 2, Working With an AI Team)

Link to the classroom: https://www.coursera.org/learn/ai-for-everyone/lecture/74dmT/working-with-an-ai-team

Hi there, greetings,

I would like to ask a question about the training set and the test set.

From what I understand, the training set team has a harder task than the test set team: the training set team prepares the algorithm and the dataset, while the test set team just receives the finished algorithm, prepares a test dataset, checks the accuracy, and reports back to the AI team.

🏗️ Training Phase (Harder Job)

1. Collect and clean data
2. Label data (Input A → Output B)
3. Choose model & train it
4. Tune parameters & optimize
5. Finalize trained model ✅

↓ Model is now trained

               🎯 Testing Phase (Easier Job)  

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ :one: Use trained model on new test data β”‚
β”‚ :two: Measure accuracy and errors :bar_chart: β”‚
β”‚ :three: Report results to training team β”‚
β”‚ :four: If bad accuracy, training team fixes β”‚
β”‚ :five: Repeat with improved model :white_check_mark: β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

If this is true, how is this division of work actually decided? How is it fair?

Hi Mikha, and welcome to the Forum! First, these two phases (training and evaluation of a machine learning model) can be carried out by the same team. They don't have to be separate, and I think it's more typical for one team to handle both. After a team trains their model, they evaluate it on a test set so they know what to expect on real-world data, and they report this to the other stakeholders who will use the model. As you might be implying, this evaluation should be quick and easy, because the model just consumes a test set that is much smaller than the training set. So in that regard, this is fair, because the same team does both.
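
To make the train-then-evaluate flow a bit more concrete, here is a minimal sketch in plain Python. Everything in it is made up for illustration: the toy data, the 80/20 split, the function names, and the one-number "model" (a threshold). A real project would more likely use a library such as scikit-learn, but the shape is the same: one script, and usually one team, does both phases.

```python
import random


def make_dataset(n, seed=0):
    """Toy labeled data: input A is a number in [0, 10), output B is 1 when A > 5."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        a = rng.uniform(0, 10)
        data.append((a, 1 if a > 5 else 0))
    return data


def split_train_test(data, test_fraction=0.2, seed=0):
    """Shuffle a copy of the data and hold out `test_fraction` of it for testing."""
    rng = random.Random(seed)
    shuffled = data[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def predict(threshold, a):
    """The 'model': predict 1 when the input is above the learned threshold."""
    return 1 if a > threshold else 0


def train(train_set):
    """Training phase: pick the threshold that makes the fewest mistakes on the training set."""
    def errors(t):
        return sum(predict(t, a) != b for a, b in train_set)

    candidates = sorted(a for a, _ in train_set)
    return min(candidates, key=errors)


def accuracy(threshold, dataset):
    """Testing phase: fraction of examples the trained model gets right."""
    correct = sum(predict(threshold, a) == b for a, b in dataset)
    return correct / len(dataset)


data = make_dataset(1000)
train_set, test_set = split_train_test(data)   # test set is much smaller
threshold = train(train_set)                    # the slower, harder step
test_accuracy = accuracy(threshold, test_set)   # the quick check on unseen data

print(f"train size={len(train_set)}, test size={len(test_set)}")
print(f"learned threshold={threshold:.2f}, test accuracy={test_accuracy:.2%}")
```

Note that the test examples are never shown to `train`, which is what lets the test accuracy stand in for how the model might do on new data.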

What might need more effort or additional personnel is deploying the model on real-world data. There, you might need engineers or developers to make the model easy for your users to use (e.g., by creating a mobile app with voice recognition). This might also include storing new incoming data so it can be used as the training and test datasets for the next model iteration.

Hope this helps!