Looking at the videos on sequence-to-sequence models in C5W3, I wonder if ChatGPT is a combination of a sequence-to-sequence model (to kick off and wrap up the responses) and some sort of one-to-many model (probably with LSTM cells) in the middle, plus some regulator for response length (?)
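For context, here's roughly the picture I have in mind: a minimal LSTM encoder-decoder sketch in Keras, in the spirit of C5W3. All names and dimensions below are toy values I made up, not anything from ChatGPT:

```python
# Toy LSTM encoder-decoder (seq2seq); all sizes are invented toy values.
import tensorflow as tf
from tensorflow.keras import layers

vocab_size, embed_dim, units = 10_000, 128, 256  # hypothetical dimensions

# Encoder: reads the input sequence and summarizes it into its final states.
enc_in = layers.Input(shape=(None,), name="encoder_tokens")
enc_emb = layers.Embedding(vocab_size, embed_dim)(enc_in)
_, state_h, state_c = layers.LSTM(units, return_state=True)(enc_emb)

# Decoder: the "one-to-many" part, seeded with the encoder's states and
# producing next-token scores at every step.
dec_in = layers.Input(shape=(None,), name="decoder_tokens")
dec_emb = layers.Embedding(vocab_size, embed_dim)(dec_in)
dec_seq = layers.LSTM(units, return_sequences=True)(
    dec_emb, initial_state=[state_h, state_c])
logits = layers.Dense(vocab_size)(dec_seq)

model = tf.keras.Model([enc_in, dec_in], logits)
```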
It’s a very complicated example of a Transformer, which you’ll see in C5W4.
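As a rough preview of W4 (a toy sketch only, with invented dimensions, not ChatGPT's actual architecture), the key difference is that a Transformer replaces recurrent cells with self-attention. A causal mask lets one stack of layers both read the prompt and generate the reply token by token, so no separate seq2seq/one-to-many split is needed:

```python
# Toy decoder-style Transformer block in Keras; all sizes are made up.
# (use_causal_mask needs a reasonably recent TensorFlow, >= 2.10.)
import tensorflow as tf
from tensorflow.keras import layers

d_model, num_heads, ff_dim, vocab_size = 128, 4, 512, 10_000  # toy sizes

tokens = layers.Input(shape=(None,), name="tokens")
x = layers.Embedding(vocab_size, d_model)(tokens)

# Causal self-attention: each position attends only to earlier positions.
attn = layers.MultiHeadAttention(num_heads, key_dim=d_model // num_heads)
x = layers.LayerNormalization()(x + attn(x, x, use_causal_mask=True))

# Position-wise feed-forward network with a residual connection.
ffn = tf.keras.Sequential([layers.Dense(ff_dim, activation="relu"),
                           layers.Dense(d_model)])
x = layers.LayerNormalization()(x + ffn(x))

logits = layers.Dense(vocab_size)(x)  # next-token scores at each position
model = tf.keras.Model(tokens, logits)
```

Roughly speaking, generation is then a loop: feed the tokens so far, sample the next token from the last position's logits, append, repeat until a learned end-of-sequence token is produced, so no explicit length regulator is required.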
… I’ll hold my horses until next week