Week 1, before Exercise 1: how to open the subwords file

Is there any way to see what the subwords look like?

Hi @Amazing_Patrick

A simple way is to open “ende_32k.subword” in a text editor (such as Notepad on Windows). Each subword sits on its own line, and its position in the file gives its index, counting from 0 (so the first line is index 0). For example, index 0 is ‘<pad>’, index 1 is ‘<EOS>’, index 2 is ‘, _’, and so on.
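If you would rather do this from a notebook cell than a text editor, here is a minimal sketch. It assumes the file holds one subword per line and lives at “data/ende_32k.subword” relative to the notebook (adjust the path if yours differs). The helper names `load_subwords` and `show_subwords` are made up for this example and are not part of the course code.

```python
from pathlib import Path


def load_subwords(path):
    """Read the vocabulary file; the list position of each line is its subword index.

    Assumption: one subword per line, UTF-8 encoded.
    """
    with open(path, encoding="utf-8") as f:
        # Strip only the trailing newline so spaces and underscores are preserved.
        return [line.rstrip("\n") for line in f]


def show_subwords(subwords, start=0, count=10):
    """Print `count` entries beginning at index `start`, as 'index<TAB>subword'."""
    for index, subword in enumerate(subwords[start:start + count], start=start):
        # repr() makes trailing spaces and other invisible characters visible.
        print(f"{index}\t{subword!r}")


if __name__ == "__main__":
    vocab_path = Path("data") / "ende_32k.subword"  # assumed location
    if vocab_path.exists():
        vocab = load_subwords(vocab_path)
        print(f"{len(vocab)} entries")
        show_subwords(vocab, start=0, count=10)
    else:
        print(f"File not found: {vocab_path}")
```

Changing `start` lets you look at any part of the vocabulary, for example `show_subwords(vocab, start=1000, count=20)`.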

Alternatively, you can experiment with the provided tokenize and detokenize functions.

Thank you so much! Do you know where I can find the ende_32k.subword file? Can we download it to our local computer?

Yes, of course.

  1. Open your assignment/lecture notebook.
  2. In the top-left menu, choose “File → Open…”.
  3. Click on the “data” directory.
  4. Tick the checkbox next to “ende_32k.subword”; a “Download” button then appears a little higher up on the page. (You can also click on “ende_32k.subword” itself to view its contents in your browser.)

Cheers

Got it! Thank you so much!