For the lab (Neural Machine Translation with Attention), I'm trying to create a model that takes X (a date in any format) and returns Y (the date in YYYY-MM-DD), the same objective as the lab. However, I only want to build a simple architecture without attention (a plain encoder-decoder).
Is it possible to do this without feeding one character at a time to the model, i.e. feed the whole string at once? If so, how would I do that?
Hi @M_jAd1
You can use an embedding layer to represent each character as a vector, then pass the entire sequence (date) into the encoder. The encoder will process the whole sequence at once and produce a context vector. The decoder will then generate the output sequence (YYYY-MM-DD format) from this context vector, step by step. It is worth mentioning that attention often helps with longer sequences.
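Feeding the whole string really just means converting it to a fixed-length sequence of character indices before it reaches the embedding layer. As a rough sketch (the names human_vocab, Tx, and the pad/unknown entries are assumptions based on the lab's conventions, not exact code from it):

import numpy as np

Tx = 30  # assumed maximum input length, as in the lab

def string_to_indices(date_str, vocab, length=Tx):
    # Lowercase and truncate the raw date string
    chars = list(date_str.lower())[:length]
    # Map each character to its index, falling back to an '<unk>' entry if present
    idx = [vocab.get(c, vocab.get('<unk>', 0)) for c in chars]
    # Pad up to the fixed length with a '<pad>' entry (index 0 assumed here)
    idx += [vocab.get('<pad>', 0)] * (length - len(idx))
    return np.array(idx)

# Example: X becomes a (num_examples, 30) array of integer indices
# X_train = np.stack([string_to_indices(s, human_vocab) for s in raw_dates])

With the input prepared like this, an Embedding layer consumes the whole (batch, 30) array in a single call, so nothing has to be fed character by character.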
Hope it helps! Feel free to ask if you need further assistance.
Thank you @Alireza_Saei
Is it something like this:
# Imports (assuming TensorFlow Keras, as used in the lab)
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, RepeatVector, TimeDistributed, Dense

# Create the model
model = Sequential()
# Embedding layer
model.add(Embedding(input_dim=len(human_vocab), output_dim=64, input_length=30))
# Encoder LSTM: return_sequences=False keeps only the final state as a context vector
model.add(LSTM(128, return_sequences=False))
# Repeat vector to match the output sequence length
model.add(RepeatVector(10))
# Decoder LSTM layer (for each step of the output sequence)
model.add(LSTM(128, return_sequences=True))
# Output layer: softmax over the 11 machine_vocab characters at each output step
model.add(TimeDistributed(Dense(11, activation='softmax')))
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
history = model.fit(X_train, Yoh_train, epochs=20, batch_size=64, validation_split=0.2)
I couldn't get this to work. X_train has shape (9950, 30), with an integer index for each character taken from human_vocab.
Yoh_train is one-hot encoded with shape (9950, 10, 11), each character mapped through machine_vocab. Is that right?
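A quick way to sanity-check those shapes and see what the model is actually predicting is something like the following sketch (it assumes an inv_machine_vocab index-to-character dict as in the lab; adjust the names if yours differ):

import numpy as np

# Confirm the shapes described above
print(X_train.shape)    # expected (9950, 30): one integer index per input character
print(Yoh_train.shape)  # expected (9950, 10, 11): one-hot over machine_vocab per output position

# Decode one prediction back to characters
pred = model.predict(X_train[:1])        # shape (1, 10, 11)
indices = np.argmax(pred, axis=-1)[0]    # most likely index at each of the 10 positions
decoded = ''.join(inv_machine_vocab[int(i)] for i in indices)
print(decoded)                           # should approach YYYY-MM-DD format as training improves

If the shapes match and training still fails, posting the exact error message would help narrow down where it goes wrong.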