Hi,
I found the C4W3 "Fine-Tuning BERT" video confusing. It quickly introduces several acronyms (MNLI, NER, and SQuAD) without explaining what they are. All three slides seem to say effectively the same thing: fine-tuning means replacing the unlabeled pre-training data with task-specific labeled data. Then the video ends abruptly, telling me I now know how to fine-tune BERT. That felt pretty light on details.

The video also seems to repeat what the earlier "Transfer Learning in NLP" video already said about fine-tuning, without really introducing new information.
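For anyone else left wondering, here is roughly the level of detail I was hoping the video would get to: a minimal sketch of fine-tuning BERT on a labeled classification task, written with the Hugging Face transformers library (which is not the course's own code; the tiny dataset, label scheme, and learning rate below are made up purely for illustration):

```python
# Minimal sketch: fine-tuning BERT for sentence classification.
# Assumes the Hugging Face `transformers` library and PyTorch;
# the texts, labels, and hyperparameters are illustrative only.
import torch
from torch.optim import AdamW
from transformers import BertTokenizer, BertForSequenceClassification

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2  # new classification head on top of pre-trained BERT
)

# Task-specific *labeled* data replaces the unlabeled pre-training corpus.
texts = ["great movie", "terrible plot"]   # hypothetical examples
labels = torch.tensor([1, 0])              # 1 = positive, 0 = negative

batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
optimizer = AdamW(model.parameters(), lr=2e-5)

# One training step: the loss is computed against the labels,
# and gradients update the whole pre-trained network plus the new head.
model.train()
outputs = model(**batch, labels=labels)
outputs.loss.backward()
optimizer.step()
```

Even a single slide walking through the pieces above (where the labels come in, what changes relative to pre-training, what the new output head is) would have made the video much more concrete.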