RNN Concepts too confusing

arvyzukai · July 31, 2023, 1:06pm

Have you checked this post? It addresses most of these questions, and it shows all the shapes (you can change the batch dimension from 1 to any number without any problem).

But to try to answer them again:

If I understand you correctly, the lecture shows how you can take one input (x_t) (one word, or rather the embedding of that word, for example a tensor of shape (1, 512)), concatenate it with the previous hidden_state (h_{t-1}) (a tensor of shape (1, 512)), and do one matrix multiplication with W_ht (whose shape should be (1024, 512)). This way you do just one matrix multiplication, and the result is equivalent to having two weight matrices (W_hx and W_hh) and two inputs (x and h).
Here the batch_dimension is 1, but it could be any number.
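A minimal NumPy sketch of that equivalence, in case it helps to see the shapes. The random values, the variable names, and the order of concatenation (x first, then h) are assumptions for illustration; the bias and activation are left out because they do not affect the comparison.

```python
import numpy as np

rng = np.random.default_rng(0)
batch, emb_dim, hid_dim = 1, 512, 512  # batch could be any number

x_t = rng.normal(size=(batch, emb_dim))      # embedding of the current word
h_prev = rng.normal(size=(batch, hid_dim))   # previous hidden state h_{t-1}

# Two separate weight matrices, one per input.
W_hx = rng.normal(size=(emb_dim, hid_dim))
W_hh = rng.normal(size=(hid_dim, hid_dim))
two_matmuls = x_t @ W_hx + h_prev @ W_hh

# One stacked matrix (the W_ht of the post) applied to the concatenated input.
W_cat = np.concatenate([W_hx, W_hh], axis=0)   # shape (1024, 512)
xh = np.concatenate([x_t, h_prev], axis=1)     # shape (1, 1024)
one_matmul = xh @ W_cat                        # shape (1, 512)

same = np.allclose(two_matmuls, one_matmul)
print(W_cat.shape, one_matmul.shape, same)
```

Changing `batch` to another number only changes the first dimension of `x_t`, `h_prev`, and the result; the weight shapes stay the same.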

In other words, shifting right turns your input into [0,54,23], and you have to predict the target [54,23,35]. Of course, your sequence length (max_len) is not 3 as in this example but longer (like 64 in the assignment), so the shift usually does not cut off the last word, just a padding token.
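Here is a small plain-Python sketch of the shift described above. `shift_right` is a hypothetical helper written for illustration, not the Trax layer itself, and the pad id of 0 is an assumption taken from the example.

```python
def shift_right(tokens, pad_id=0):
    """Prepend pad_id and drop the last token so the length is unchanged."""
    return [pad_id] + tokens[:-1]

# The short example from above: the target is the unshifted sequence.
target = [54, 23, 35]
model_input = shift_right(target)
print(model_input)  # [0, 54, 23]

# With padding up to a longer max_len, the dropped element is a pad token,
# so no real word is lost.
padded_target = [54, 23, 35, 0, 0]
padded_input = shift_right(padded_target)
print(padded_input)  # [0, 54, 23, 35, 0]
```

At each position the model sees only the tokens before the one it has to predict, which is the point of the shift.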

You misunderstand this concept. All of this is considered one time-step: the first layer receives an input and produces an output, which is the input for the layer “above it”; that layer in turn produces an output, which is the input for the next layer “above it”. When all the layers have finished, the next token becomes the input and the whole thing repeats for every time-step.
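The loop structure can be sketched like this. It is only an illustration under stated assumptions: a plain tanh RNN cell stands in for the GRU, and the sizes, the random weights, and the names are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n_layers, dim, seq_len = 2, 4, 3

# One weight matrix per layer, applied to [layer input, layer hidden state].
weights = [rng.normal(size=(2 * dim, dim)) for _ in range(n_layers)]
hidden = [np.zeros((1, dim)) for _ in range(n_layers)]
inputs = rng.normal(size=(seq_len, 1, dim))  # one embedded token per time-step

outputs = []
for x_t in inputs:                  # outer loop: one iteration is one time-step
    layer_input = x_t
    for i in range(n_layers):       # inner loop: bottom layer to top layer
        concat = np.concatenate([layer_input, hidden[i]], axis=1)
        hidden[i] = np.tanh(concat @ weights[i])
        layer_input = hidden[i]     # output of this layer feeds the layer above
    outputs.append(layer_input)     # top-layer output for this time-step

print(len(outputs), outputs[0].shape)
```

Each layer keeps its own hidden state across time-steps, while the data for a single token flows upward through all the layers before the next token is read.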

They are hard for humans to interpret. Some are obvious, like punctuation or POS; some seem completely random. There is a classic post about character-level RNNs and the features they learn (in the middle of the article).

Cheers
