Week 1 - Case Study Speech Recognition - Data Step

In this slide, the instructor only mentioned the “Define data” step. Is the “Label and organize data” step only for collecting additional data if required?

I am looking for the exact section so that I can review it and answer. It would help if you could include a link to the exact video.

The link is
https://www.coursera.org/learn/introduction-to-machine-learning-in-production/lecture/bwXgc/case-study-speech-recognition (from 1:55 through 5:28)

That segment is also about labeling: it is about coming up with a consistent labeling methodology. The whole discussion covers both “Define data” and “Label and organize data”.

Hope that helps.

@satishnandi: Thank you for your quick response!

It appears to me that the examples don’t really justify the separation of the substeps ‘Define data & establish baseline’ and ‘label & organize data’. I think we should either revise the framework or add a clearer example.

Edit: The instructor says he will discuss the data step further in Week 3. I will finish the course and post an update here later.