Week-3: Session - More label ambiguity

Jeet · May 17, 2021, 8:47am

Hi,

At time-stamp 2:48 “User ID merge example” -
I understand that we can manually compare whether two records belong to the same person, but how do we perform this task in real life, say when there are thousands or millions of records to compare?
Are there any tools or libraries for that?

Thank You.

satishnandi · May 19, 2021, 11:43pm

The professor is giving an example of how the merge can be done. You will need a smaller labeled dataset. You train the ML model on this labeled data and then use the trained model to handle the larger dataset.
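
To make the idea above concrete, here is a minimal sketch using only the Python standard library. It is not the method from the lecture: the record fields (`name`, `email`), the sample records, and all function names are made up for illustration, and the "model" is just a single similarity cutoff learned from a few labeled pairs. This kind of task is usually called record linkage or entity resolution, and dedicated libraries such as `recordlinkage` and `dedupe` exist for doing it at scale.

```python
from difflib import SequenceMatcher

FIELDS = ("name", "email")  # hypothetical fields shared by both datasets


def similarity(a: str, b: str) -> float:
    """String similarity in [0, 1], ignoring case and surrounding whitespace."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def pair_score(rec_a: dict, rec_b: dict) -> float:
    """Average field-by-field similarity of two records."""
    return sum(similarity(rec_a[f], rec_b[f]) for f in FIELDS) / len(FIELDS)


def fit_threshold(labeled_pairs) -> float:
    """Pick the score cutoff that classifies the labeled pairs most accurately."""
    scored = [(pair_score(a, b), is_match) for a, b, is_match in labeled_pairs]
    best_t, best_acc = 0.5, -1.0
    for t in sorted({s for s, _ in scored}):
        acc = sum((s >= t) == y for s, y in scored) / len(scored)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t


def same_person(rec_a: dict, rec_b: dict, threshold: float) -> bool:
    """Predict whether two records refer to the same person."""
    return pair_score(rec_a, rec_b) >= threshold


# Small hand-labeled set (invented data): (record A, record B, is_same_person)
labeled_pairs = [
    ({"name": "Jane Doe", "email": "jane.doe@example.com"},
     {"name": "jane doe", "email": "Jane.Doe@example.com"}, True),
    ({"name": "Robert Smith", "email": "rsmith@example.com"},
     {"name": "Rob Smith", "email": "rsmith@example.com"}, True),
    ({"name": "Jane Doe", "email": "jane.doe@example.com"},
     {"name": "John Roe", "email": "john.roe@example.org"}, False),
    ({"name": "Alice Wong", "email": "awong@example.com"},
     {"name": "Bob Lee", "email": "blee@example.com"}, False),
]

threshold = fit_threshold(labeled_pairs)

# Apply the learned cutoff to a pair that was not in the labeled set.
new_a = {"name": "Maria Garcia", "email": "mgarcia@example.com"}
new_b = {"name": "MARIA GARCIA ", "email": "mgarcia@example.com"}
print(same_person(new_a, new_b, threshold))
```

With millions of records, comparing every pair is too expensive, so in practice you would first group records by a cheap key (for example the same email domain or the same first letter of the surname) and only score pairs within a group. A real system would also likely replace the single cutoff with a trained classifier over several similarity features.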

apolanco3225 · May 20, 2021, 2:53am

Hi @Jeet ,

Unfortunately, this is not my strongest area of expertise. Searching on Google, I found this interesting article; perhaps it will be useful to you. Depending on the amount of data, having humans do this task could become impossible; in the video they mention using a supervised learning algorithm instead.

Best, Arturo
