Meta’s Llama 3.2 goes multimodal


Meta released Llama 3.2, a family of vision-capable large language models and lightweight text models for edge devices. The new lineup includes 11 billion and 90 billion parameter multimodal models that can reason about images, outperforming Claude 3 Haiku and GPT-4o mini on visual understanding tasks. Meta also launched 1 billion and 3 billion parameter models optimized for on-device use with 128K-token context lengths; the 3B model outperforms Gemma 2 2.6B and Phi 3.5-mini on tasks such as instruction following and summarization. The company also introduced Llama Stack distributions to simplify deployment across environments and updated its safety tools, including Llama Guard 3 for vision tasks. (Meta AI)
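For readers who want to try the vision models, below is a minimal sketch of asking the 11B model a question about an image. It assumes the checkpoint is available on Hugging Face under an ID like meta-llama/Llama-3.2-11B-Vision-Instruct and that your installed transformers version includes the Mllama classes; adjust the model ID and prompt format to match the official release documentation.

```python
# Minimal sketch: querying Llama 3.2 11B Vision about a local image.
# Assumes a recent transformers release with Mllama support and access to the
# (assumed) Hugging Face checkpoint "meta-llama/Llama-3.2-11B-Vision-Instruct".
import torch
from PIL import Image
from transformers import AutoProcessor, MllamaForConditionalGeneration

model_id = "meta-llama/Llama-3.2-11B-Vision-Instruct"  # assumed model ID

model = MllamaForConditionalGeneration.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
processor = AutoProcessor.from_pretrained(model_id)

image = Image.open("chart.png")  # any local image file

# Build a chat-style prompt that interleaves the image with a question.
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image"},
            {"type": "text", "text": "What trend does this chart show?"},
        ],
    }
]
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(image, prompt, return_tensors="pt").to(model.device)

output = model.generate(**inputs, max_new_tokens=128)
print(processor.decode(output[0], skip_special_tokens=True))
```

The smaller 1B and 3B text models follow the usual text-only chat workflow, so the image and processor steps above would simply drop out.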