Seeking advice on a fine-tuned generative AI model

Recently, I fine-tuned Stable Diffusion XL to generate building images. I created a dataset with 11,000 images of buildings, each paired with a corresponding prompt. However, about 90% of the images have a resolution below 1024x1024, mostly ranging from 500x500 to 700x700.
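For anyone who wants to check a similar dataset, here is a rough sketch of how the resolution breakdown could be measured. It assumes the (width, height) pairs have already been collected (for example with Pillow's `Image.open(path).size`); the function names, bucket labels, and the 1024 target are illustrative choices, not part of any training script.

```python
from collections import Counter

TARGET = 1024  # assumed training/inference side length, as in the post

def bucket(size, target=TARGET):
    """Classify a (width, height) pair by how much its shorter side
    would need to be upscaled to reach `target`."""
    short = min(size)
    if short >= target:
        return "native"          # no upscaling needed
    if short >= target // 2:
        return "upscale_le_2x"   # upscale factor of at most 2x
    return "upscale_gt_2x"       # upscale factor above 2x

def summarize(sizes, target=TARGET):
    """Return the fraction of images in each bucket (empty dict if no sizes)."""
    total = len(sizes)
    if total == 0:
        return {}
    counts = Counter(bucket(s, target) for s in sizes)
    labels = ("native", "upscale_le_2x", "upscale_gt_2x")
    return {label: counts[label] / total for label in labels}

if __name__ == "__main__":
    # Hypothetical sizes, only to show the call shape.
    example = [(1024, 1024), (600, 600), (700, 500), (500, 500)]
    print(summarize(example))
```

A large share in the two upscale buckets would mean most training images were enlarged before reaching the model, which is worth knowing before looking at other causes.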

During inference, even though I generate images at 1024x1024 resolution, the quality is very poor. The images appear grainy and lack detail.

Do you think the problem lies solely with my dataset, or is there something else I'm missing?