Encoder Layer Mask Purpose

Thank you, sir.

Also, one more doubt: why are we passing Query = Value = Key = X? Why are we not doing query = W * X, value = W * X, key = W * X?
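
To make the question concrete, here is a minimal NumPy sketch of the two variants being contrasted. It is not the code from the lesson: the `attention` helper, the shapes, and the random weight matrices `W_q`, `W_k`, `W_v` are all assumptions for illustration. Tokens are stored as rows here, so the `W * X` in the question becomes `X @ W`.

```python
import numpy as np

def softmax(z, axis=-1):
    # Subtract the row max for numerical stability before exponentiating.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def attention(query, key, value):
    # Plain scaled dot-product attention; it applies no projections itself.
    d_k = query.shape[-1]
    scores = query @ key.T / np.sqrt(d_k)
    return softmax(scores) @ value

rng = np.random.default_rng(0)
seq_len, d_model = 4, 8  # hypothetical sizes
X = rng.normal(size=(seq_len, d_model))

# Variant A: the same tensor is passed as query, key, and value.
out_shared = attention(X, X, X)

# Variant B: X is projected first (random stand-ins for learned weights).
W_q, W_k, W_v = (rng.normal(size=(d_model, d_model)) for _ in range(3))
out_projected = attention(X @ W_q, X @ W_k, X @ W_v)

print(out_shared.shape, out_projected.shape)
```

Note that some implementations, such as PyTorch's `nn.MultiheadAttention`, apply the learned projections inside the attention module, so a call site that passes X three times does not by itself rule out Variant B.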