Hello, can someone help me understand the weight parameters Wy and Wa and the biases by and ba? What do they do, and how are they determined? The lectures also mention vanishing gradients. These come up in Sequence Models, part of the Deep Learning Specialization. I jumped directly into Sequence Models; are these parameter definitions and vanishing gradients introduced in the previous courses? Thanks!
I moved your thread to the category of Deep Learning Specialization Course 5.
I recommend you go back to Courses 1 and 2, where these concepts are introduced.
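In the meantime, here is a rough sketch of what those parameters do in one step of a basic RNN cell. It assumes the usual notation in which Wa multiplies the previous activation stacked on top of the current input, and Wy maps the activation to the output. The function name `rnn_step`, the layer sizes, and the random initial values are illustrative assumptions, not course code.

```python
import numpy as np

def softmax(z):
    # Subtract the column max for numerical stability
    e = np.exp(z - z.max(axis=0, keepdims=True))
    return e / e.sum(axis=0, keepdims=True)

def rnn_step(x_t, a_prev, Wa, ba, Wy, by):
    """One forward step of a basic RNN cell (hypothetical helper)."""
    # Wa acts on the previous activation stacked with the current input
    stacked = np.vstack([a_prev, x_t])
    a_t = np.tanh(Wa @ stacked + ba)   # new hidden activation
    y_t = softmax(Wy @ a_t + by)       # prediction at this time step
    return a_t, y_t

# Assumed sizes: hidden units, input features, output classes
n_a, n_x, n_y = 4, 3, 2
rng = np.random.default_rng(0)

# Weights start as small random numbers and biases as zeros
Wa = rng.normal(scale=0.1, size=(n_a, n_a + n_x))
ba = np.zeros((n_a, 1))
Wy = rng.normal(scale=0.1, size=(n_y, n_a))
by = np.zeros((n_y, 1))

# Run five time steps on random inputs, reusing the same parameters
a = np.zeros((n_a, 1))
for t in range(5):
    x_t = rng.normal(size=(n_x, 1))
    a, y = rnn_step(x_t, a, Wa, ba, Wy, by)
```

These values are not set by hand. They start out random and are updated by gradient descent, with the gradients computed by backpropagation through time. The vanishing gradient issue comes from that backward pass: the gradient is multiplied through many time steps and can shrink toward zero, so early time steps have little influence on the updates.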