C5W2A2 Emojify V2

Hi All,
How is the output of the second LSTM contracted / shrunk from ten words to one value? I mean the following:

  • the input to the second LSTM is (None, 10, 128) where 10 is the maximum number of words per sentence
  • the output, however, is (None, 128)
    It seems like one value is created out of the 10 words. If so, what is the rule? (See the sketch just below.)
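
For concreteness, here is roughly the relevant part of the model in code, with the shapes I observe (the 50-dimensional word embedding is just an assumption for illustration):

```python
import tensorflow as tf

# Sentences are padded to 10 words; assume each word is a 50-dim embedding.
X = tf.keras.Input(shape=(10, 50))

# First LSTM layer: one 128-dim hidden state per word.
X = tf.keras.layers.LSTM(128, return_sequences=True)(X)   # (None, 10, 128)

# Second LSTM layer: the 10-word axis disappears. How?
X = tf.keras.layers.LSTM(128, return_sequences=False)(X)  # (None, 128)
```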


Does reading about return_sequences from here help?

Hi, it does give a hint :slight_smile: I will keep reading up on what return_sequences does. The info in the manual is pretty obscure… Thanks a lot!


There is this comment in the template code:

# Propagate X trough another LSTM layer with 128-dimensional hidden state
# The returned output should be a single hidden state, not a batch of sequences.

Of course they misspelled “through” there (I’ll file a bug).

And then in the instructions later in the notebook under “What you should remember”, there is this statement:

* LSTM() has a flag called return_sequences to decide if you would like to return every hidden state or only the last one.
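
To see what that flag does, here is a small self-contained sketch (the tensor sizes are arbitrary, chosen to mirror the shapes discussed above):

```python
import tensorflow as tf

# A dummy batch: 4 sentences, 10 words each, 50 features per word.
x = tf.random.normal((4, 10, 50))

# return_sequences=True: return every hidden state, one per timestep.
all_states = tf.keras.layers.LSTM(128, return_sequences=True)(x)
print(all_states.shape)  # (4, 10, 128)

# return_sequences=False (the default): return only the last hidden state,
# so the 10-step time axis is dropped.
last_state = tf.keras.layers.LSTM(128)(x)
print(last_state.shape)  # (4, 128)
```

So the second LSTM isn’t averaging or combining the 10 words by some extra rule; it still processes the whole sequence step by step, but it hands back only the hidden state from the final step.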


By “the manual” I meant the tensorflow.org web page, not the comments in the Emojify assignment. For a layperson like myself, “the last output in the output sequence” sounds like “the last sentence” or “the last word”. Thanks a lot.


Right, it just depends on what the elements of your sequence are. Depending on your application, it could be musical notes in a phrase or letters in a word or words in a sentence or sentences in a paragraph or paragraphs in a chapter or :nerd_face:


Like most technical documentation, the TensorFlow docs were written by experts, and they only really make sense if you already know the material well enough that you don’t need to read them.

They’re not written as tutorials, sadly.


Indeed, the detailed API docs are really only accessible once you already know what you’re doing. There are so many of them and there are so many layers of inheritance that there’s just no way they could really explain everything at a tutorial level on each API page.

But the good news is that there are lots of introduction and tutorial pages on the TF site. E.g. here’s one from François Chollet that leads you through a bunch of the complex APIs and shows how to make transfer learning work. Not directly relevant to the question here, of course, but just a demo of what is out there.

So it’s an issue of where you start on a given question. You have to make your own judgement about whether the detailed API docs are the place to begin. Or you try them first, and when they sound like Greek, you take that as a message that you need to look for a higher-level document.
