Neural network architecture suggestion(s) for video prediction (image sequence)

Hello everybody! Could you please suggest neural network architectures for video prediction (image sequences), where each sequence takes 144 input images and predicts 48 images? For example, I tested the WaveNet and CNN-LSTM networks available on Bitbucket (among others) and did not get good results. Thanks in advance!

Try different architectures. Start with a simple one, then add complexity step by step, guided by the results…
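One way to make "start simple" concrete is to measure a trivial baseline before training any network. The sketch below is only an illustration, not a recommended architecture: it assumes frames are small 2D lists of floats, and the names `persistence_forecast`, `mean_squared_error`, and `make_toy_sequence` are hypothetical helpers invented for this example. It repeats the last of the 144 input frames 48 times and scores that guess against the true frames.

```python
# Persistence baseline: predict every future frame as a copy of the last input frame.
# All function names here are hypothetical; the toy data is synthetic (an assumption).

def persistence_forecast(input_frames, horizon):
    """Return `horizon` predicted frames, each a copy of the last observed frame."""
    last = input_frames[-1]
    return [[row[:] for row in last] for _ in range(horizon)]


def mean_squared_error(predicted, target):
    """Average squared pixel difference over all frames, rows, and columns."""
    total = 0.0
    count = 0
    for p_frame, t_frame in zip(predicted, target):
        for p_row, t_row in zip(p_frame, t_frame):
            for p, t in zip(p_row, t_row):
                total += (p - t) ** 2
                count += 1
    return total / count


def make_toy_sequence(length, height=4, width=4):
    """Synthetic frames whose pixel values equal the time index (demo data only)."""
    return [
        [[float(t) for _ in range(width)] for _ in range(height)]
        for t in range(length)
    ]


if __name__ == "__main__":
    n_in, n_out = 144, 48  # sequence lengths taken from the question
    sequence = make_toy_sequence(n_in + n_out)
    inputs, targets = sequence[:n_in], sequence[n_in:]

    predictions = persistence_forecast(inputs, n_out)
    print("predicted frames:", len(predictions))
    print("baseline MSE:", mean_squared_error(predictions, targets))
```

If a trained model cannot beat this baseline error on held-out sequences, that likely points to a problem with the data pipeline or training setup rather than with the choice of architecture, which is worth ruling out before moving to something more complex.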