What you’ll learn in this course
Prompt engineering isn’t limited to text models; it applies to vision models as well. Depending on the vision model, the prompt may be text, but it can also be pixel coordinates, bounding boxes, or segmentation masks.
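For example, a segmentation model can take a coordinate prompt instead of text. Here is a minimal sketch of point-prompting SAM through the Hugging Face transformers library; the checkpoint name, image path, and point location are illustrative assumptions, not course specifics:

```python
import torch
from PIL import Image
from transformers import SamModel, SamProcessor

# Load a SAM checkpoint (facebook/sam-vit-base is an assumed example).
processor = SamProcessor.from_pretrained("facebook/sam-vit-base")
model = SamModel.from_pretrained("facebook/sam-vit-base")

image = Image.open("truck.jpg").convert("RGB")  # placeholder image

# The "prompt" here is a single (x, y) pixel coordinate, not text.
input_points = [[[450, 600]]]  # one point on the object to segment

inputs = processor(image, input_points=input_points, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Resize predicted masks back to the original image resolution.
masks = processor.image_processor.post_process_masks(
    outputs.pred_masks, inputs["original_sizes"], inputs["reshaped_input_sizes"]
)
scores = outputs.iou_scores  # predicted mask quality per candidate
```

The same processor also accepts `input_boxes` (bounding boxes) as prompts, which is exactly the kind of non-text prompting described above.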
In this course, you’ll learn to prompt several vision models: Meta’s Segment Anything Model (SAM), a universal image segmentation model; OWL-ViT, a zero-shot object detection model; and Stable Diffusion 2.0, a widely used diffusion model. You’ll also apply DreamBooth, a fine-tuning technique, to teach a diffusion model to associate a text label with an object of your choice.
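On the text-prompt side, generating an image with Stable Diffusion 2.0 through the diffusers library looks roughly like this; the prompt string and generation parameters are placeholder assumptions:

```python
import torch
from diffusers import StableDiffusionPipeline

# Load the Stable Diffusion 2.0 text-to-image pipeline (half precision on GPU).
pipe = StableDiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2", torch_dtype=torch.float16
).to("cuda")

# The prompt and parameters below are illustrative; adjusting the wording,
# step count, and guidance scale is where prompt engineering happens here.
image = pipe(
    "a watercolor painting of a lighthouse at dawn",
    num_inference_steps=30,  # number of denoising steps
    guidance_scale=7.5,      # how strongly the output follows the prompt
).images[0]
image.save("lighthouse.png")
```

DreamBooth builds on this same pipeline: it fine-tunes the model on a handful of images of your object so that a chosen text label in the prompt reliably produces that object.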