Parameter size graphic question

In the introduction slides there is a graphic showing large language models by parameter size. However, BERT at 110M parameters is drawn at nearly the same size as PaLM at 540B parameters. Are these the right numbers, or is the graphic representing something else?

Hi @Ken_DeVoe ,

Actually, BERT comes in several model sizes. I don’t recall whether the graphic says which BERT model it represents, but they range from 110M parameters (BERT-base) up to 340M (BERT-large). You are right anyway: neither is comparable to a 540B-parameter model! It may be a typo in the graphic :slight_smile:
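To put the gap in numbers, here is a rough sketch using only the parameter counts quoted in this thread. It assumes a chart where circle area is proportional to parameter count, which is my assumption and not something the slide states; the helper name `radius_ratio` is just for illustration.

```python
import math

# Parameter counts quoted in this thread
params = {
    "BERT-base": 110e6,
    "BERT-large": 340e6,
    "PaLM": 540e9,
}


def radius_ratio(small: float, large: float) -> float:
    """Radius ratio of two circles whose AREAS are proportional to the counts.

    Assumption: area (not radius) encodes the parameter count.
    """
    return math.sqrt(large / small)


ratio = params["PaLM"] / params["BERT-base"]
print(f"PaLM / BERT-base parameter ratio: {ratio:,.0f}x")
print(
    "Radius ratio if circle area tracks parameters: "
    f"{radius_ratio(params['BERT-base'], params['PaLM']):.1f}x"
)
```

Under that assumption the PaLM circle would need a radius about 70 times that of BERT-base, so two circles of nearly equal size cannot be drawn to scale.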

Thank you! That makes sense, since there are different versions as well. A later slide is more specific and calls out BERT-base (110M) vs Bloom-175B in the graphic comparison. It's really cool to see, as it gives a mental picture of just how big some of these models are compared to BERT.

I’m not sure what you mean by the right numbers.

@Ken_DeVoe suspected the numbers on the slide were wrong because of the relative sizes of the circles used in the image. But all is clear now :slight_smile:

@Ken_DeVoe please correct me if I’m wrong.


Correct. I was just confused by the sizes of the circles compared to the parameter numbers I could find for models like BERT and PaLM. But if it is a typo on the graphic, that makes sense to me.