Question about input/output of machine translation model (Week 3)

In the description below for Exercise 3:

  • You have input X of shape (m = 10000, T_x = 30) containing the training examples.
  • Given the model() you coded, you need the “outputs” to be a list of 10 elements of shape (m, T_y).
    • The list outputs[i][0], ..., outputs[i][Ty] represents the true labels (characters) corresponding to the i^{th} training example (X[i]).
    • outputs[i][j] is the true label of the j^{th} character in the i^{th} training example.

I’m confused by the description above. I thought the input to the model should be Xoh with shape (m, Tx, len(human_vocab)), and the outputs should be a list of length 10, where outputs[i] is a 2D array of shape (m, len(machine_vocab))?
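
For concreteness, here is the kind of shape check I have in mind (a standalone sketch with dummy zero arrays rather than the notebook’s real data; m and Tx come from the description above, Ty = 10 from the stated list length, and the vocabulary sizes are just placeholders):

import numpy as np

# Sizes: m and Tx from the exercise description, Ty from the list length;
# the vocabulary sizes below are placeholders, not the notebook's actual values.
m, Tx, Ty = 10000, 30, 10
len_human_vocab, len_machine_vocab = 37, 11

# What I expect the model to consume: the one-hot encoded inputs.
Xoh = np.zeros((m, Tx, len_human_vocab))

# What I expect the labels to be: one-hot encoded targets with the time
# dimension moved to the front and split into a list, one entry per output step.
Yoh = np.zeros((m, Ty, len_machine_vocab))
outputs = list(Yoh.swapaxes(0, 1))

print(Xoh.shape)         # (10000, 30, 37)
print(len(outputs))      # 10
print(outputs[0].shape)  # (10000, 11) -> (m, len(machine_vocab))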


You are right: both what you feed into the model and what it returns are one-hot encoded. This is easiest to see in the test code below (see my comments):

EXAMPLES = ['3 May 1979', '5 April 09', '21th of August 2016', 'Tue 10 Jul 2007', 'Saturday May 9 2018', 'March 3 2001', 'March 3rd 2001', '1 March 2001']
s00 = np.zeros((1, n_s))
c00 = np.zeros((1, n_s))
for example in EXAMPLES:
    source = string_to_int(example, Tx, human_vocab)
    #print(source)
    source = np.array(list(map(lambda x: to_categorical(x, num_classes=len(human_vocab)), source))).swapaxes(0,1) # one-hot encode each character: (len(human_vocab), Tx)
    source = np.swapaxes(source, 0, 1) # swap back to (Tx, len(human_vocab)); this undoes the swapaxes above
    source = np.expand_dims(source, axis=0) # add the batch dimension: (1, Tx, len(human_vocab))
    prediction = model.predict([source, s00, c00]) # list of Ty arrays, each of shape (1, len(machine_vocab))
    prediction = np.argmax(prediction, axis=-1) # reverse the one-hot encoding: index of the most likely character at each step
    output = [inv_machine_vocab[int(i)] for i in prediction] # map indices back to characters
    print("source:", example)
    print("output:", ''.join(output),"\n")

I will report this upstream. Thank you.
