Can someone explain to me why the noise dimension is 64? Thank you!
That is just an arbitrary choice they made. It is what is usually called a “hyperparameter”, as opposed to a “parameter”, which can be learned through training. They probably tried smaller and larger values, and my guess is that you would get less variety with, say, 32 noise values, while beyond some point larger values give no perceptible further increase in variety. So 64 is the “Goldilocks” value.
But thinking just epsilon harder, the intuitions may be a little different here. The other thing to consider with most hyperparameter choices is cost in terms of CPU and memory. What we are doing here is “inflating” the noise into a synthetic image. So if you start with a smaller noise vector, maybe you need one more layer of inflation (one more transposed conv layer), which could actually end up being more expensive in runtime and in the memory needed to store the extra layer’s parameters.
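To make the cost trade-off concrete, here is a minimal framework-free sketch (the layer sizes are assumptions in the style of a DCGAN generator, not the course's exact architecture): the first "inflation" step is typically a transposed conv that maps the `nz`-dimensional noise vector to a `ngf*8 x 4 x 4` feature map, so its weight count scales linearly with `nz`.

```python
# Hypothetical sketch: weight count of the first transposed-conv layer of a
# DCGAN-style generator, as a function of the noise dimension nz.
# ngf (base feature-map width) and the 4x4 kernel are assumed values.
def first_layer_weights(nz, ngf=64, kernel=4):
    # A transposed conv with nz input channels, ngf*8 output channels,
    # and a kernel x kernel kernel has nz * (ngf*8) * kernel**2 weights.
    return nz * (ngf * 8) * kernel ** 2

for nz in (32, 64, 128):
    print(f"nz={nz}: {first_layer_weights(nz):,} weights in the first layer")
```

Doubling the noise dimension doubles only this first layer's weights; the later conv layers are unaffected, which is part of why experimenting with 32 vs. 128 is cheap.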
Everything here is an experimental science. Try implementing it with 32 or 128 and see what you find.