What will the discriminator's output Y hat look like at first, when only X hat and X are fed into the discriminator?

Sharon says the target label is used only in computing the cost function, but what output will the discriminator produce?

The discriminator produces a prediction for each input of whether that input is fake or real. The key point is that you feed the discriminator two sets of inputs: real samples and fake samples (outputs of the generator). The discriminator's cost function then has two terms, one per type of sample, and each term compares the predictions with a different target label: "real" for the real samples and "fake" for the generated ones. The labels are not inputs to the discriminator; they only appear in the cost. Please take another look at how the cost function for the discriminator is defined with this in mind, and it should make sense.
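To make the two terms concrete, here is a minimal sketch in plain Python. It assumes the discriminator outputs probabilities in (0, 1), that the cost is binary cross-entropy, and that the two terms are averaged; the function names and the sample predictions are made up for illustration and are not taken from the course code.

```python
import math

def bce(prediction, target):
    """Binary cross-entropy for one probability against a 0/1 target."""
    eps = 1e-12  # keep log() away from 0
    p = min(max(prediction, eps), 1.0 - eps)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))

def discriminator_loss(preds_real, preds_fake):
    """Two-term discriminator cost (hypothetical helper).

    preds_real: discriminator outputs for real samples X
    preds_fake: discriminator outputs for generated samples X hat
    The labels (1 = real, 0 = fake) appear only here, never as
    inputs to the discriminator.
    """
    real_term = sum(bce(p, 1.0) for p in preds_real) / len(preds_real)
    fake_term = sum(bce(p, 0.0) for p in preds_fake) / len(preds_fake)
    # Averaging the two terms is an assumption; some formulations sum them.
    return (real_term + fake_term) / 2.0

# Made-up discriminator outputs, for illustration only.
preds_real = [0.9, 0.8]   # outputs for real samples
preds_fake = [0.2, 0.1]   # outputs for generated samples
loss = discriminator_loss(preds_real, preds_fake)

# A discriminator that cannot tell the two apart outputs about 0.5 everywhere.
undecided_loss = discriminator_loss([0.5, 0.5], [0.5, 0.5])
```

Under these assumptions the discriminator's output is just a number per sample; which label it is compared against depends only on whether the sample came from the real data or from the generator.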