How do today's real, big models work?

Super great course! Thanks a lot.

I'm trying to build a modern diffusion model, and I have some questions:
1/ What problems can bigger images (500×500) cause? (Slower training, of course, and what else?)

2/ What network do 2023 models (DALL·E, Midjourney…) use instead of a U-Net? I guess transformers. What type? Where in the architecture would you insert the context and time embeddings?
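To make question 2 concrete, here is a toy sketch (pure NumPy, made-up dimensions, single head, no training) of what I imagine: the timestep embedding is added to the image tokens, and the caption is injected through a cross-attention step where queries come from image tokens and keys/values from caption tokens. This is just my guess at the mechanism, not a claim about what DALL·E or Midjourney actually do.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 16       # model width (made-up)
n_img = 8    # image tokens (patches)
n_txt = 4    # caption tokens

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def sinusoidal_time_embedding(t, dim):
    # standard sinusoidal embedding of the diffusion timestep t
    half = dim // 2
    freqs = np.exp(-np.log(10000.0) * np.arange(half) / half)
    ang = t * freqs
    return np.concatenate([np.sin(ang), np.cos(ang)])

# made-up inputs: image-patch tokens and caption-token embeddings
img_tokens = rng.normal(size=(n_img, d))
txt_tokens = rng.normal(size=(n_txt, d))

# 1) time conditioning: add the timestep embedding to every image token
t_emb = sinusoidal_time_embedding(t=37, dim=d)
h = img_tokens + t_emb

# 2) text conditioning: cross-attention; queries from image tokens,
#    keys/values from the caption tokens
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
Q, K, V = h @ Wq, txt_tokens @ Wk, txt_tokens @ Wv
attn = softmax(Q @ K.T / np.sqrt(d))  # (n_img, n_txt) attention weights
h = h + attn @ V                      # residual connection

print(h.shape)  # (8, 16)
```

So my real question is whether this placement (time added to tokens, text via cross-attention inside each block) is roughly what production models do.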

3/ For contextual embeddings with captions like “a potato with big wheels and painted in red”, you need to embed the caption into one vector. I guess the embedding used is a pretrained, static one. Which one is it? And how do you embed a full sentence or text (more difficult than embedding a single word)?
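To show what I mean by "one vector for a full sentence", the simplest toy version I can think of is averaging static per-word vectors (a word2vec/GloVe-style table, here faked with random vectors). I assume real systems instead use a trained text encoder (e.g. a CLIP-style transformer), which is part of what I'm asking.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 8  # embedding width (made-up)

# stand-in for a pretrained static word-embedding table
vocab = ["a", "potato", "with", "big", "wheels", "and", "painted", "in", "red"]
table = {w: rng.normal(size=dim) for w in vocab}

def embed_sentence(sentence):
    # tokenize naively, look up each word, mean-pool into ONE vector
    vecs = [table[w] for w in sentence.lower().split() if w in table]
    return np.mean(vecs, axis=0)

v = embed_sentence("a potato with big wheels and painted in red")
print(v.shape)  # (8,)
```

Is this mean-pooling idea at all close to how captions are condensed, or is the whole sequence of token embeddings kept and attended over instead?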