For the lab (Neural Machine Translation with Attention), I'm trying to create a model that takes X (a date in any format) and returns Y (the date in YYYY-MM-DD), the same objective as the lab. However, I only want to build a simple architecture without attention (a plain encoder-decoder).
Is it possible to do this without feeding one character at a time to the model, i.e. feed the whole string at once? If so, how would I do that?
Hi @M_jAd1
You can use an embedding layer to represent each character as a vector, then pass the entire sequence (date) into the encoder. The encoder will process the whole sequence at once and produce a context vector. The decoder will then generate the output sequence (YYYY-MM-DD format) from this context vector, step by step. It is worth mentioning that attention often helps with longer sequences.
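Feeding the whole string really just means converting it to a fixed-length sequence of character indices before it reaches the embedding layer. As a rough sketch (the names human_vocab, Tx, and the pad/unknown entries are assumptions based on the lab's conventions, not exact code from it):

import numpy as np

Tx = 30  # assumed maximum input length, as in the lab

def string_to_indices(date_str, vocab, length=Tx):
    # Lowercase and truncate the raw date string
    chars = list(date_str.lower())[:length]
    # Map each character to its index, falling back to an '<unk>' entry if present
    idx = [vocab.get(c, vocab.get('<unk>', 0)) for c in chars]
    # Pad up to the fixed length with a '<pad>' entry (index 0 assumed here)
    idx += [vocab.get('<pad>', 0)] * (length - len(idx))
    return np.array(idx)

# Example: X becomes a (num_examples, 30) array of integer indices
# X_train = np.stack([string_to_indices(s, human_vocab) for s in raw_dates])

With the input prepared like this, an Embedding layer consumes the whole (batch, 30) array in a single call, so nothing has to be fed character by character.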
Hope it helps! Feel free to ask if you need further assistance.
Thank you @Alireza_Saei
Is it something like this:
# Imports (assuming TensorFlow Keras, as used in the lab)
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, RepeatVector, TimeDistributed, Dense

# Create the model
model = Sequential()
# Embedding layer
model.add(Embedding(input_dim=len(human_vocab), output_dim=64, input_length=30))
# Encoder LSTM: return_sequences=False keeps only the final state as a context vector
model.add(LSTM(128, return_sequences=False))
# Repeat vector to match the output sequence length
model.add(RepeatVector(10))
# Decoder LSTM layer (for each step of the output sequence)
model.add(LSTM(128, return_sequences=True))
# Output layer: softmax over the 11 machine_vocab characters at each output step
model.add(TimeDistributed(Dense(11, activation='softmax')))
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
history = model.fit(X_train, Yoh_train, epochs=20, batch_size=64, validation_split=0.2)
I couldn't get this to work. X_train has shape (9950, 30), with an integer index for each character taken from human_vocab.
Yoh_train is one-hot encoded with shape (9950, 10, 11), each character mapped through machine_vocab. Is that right?
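A quick way to sanity-check those shapes and see what the model is actually predicting is something like the following sketch (it assumes an inv_machine_vocab index-to-character dict as in the lab; adjust the names if yours differ):

import numpy as np

# Confirm the shapes described above
print(X_train.shape)    # expected (9950, 30): one integer index per input character
print(Yoh_train.shape)  # expected (9950, 10, 11): one-hot over machine_vocab per output position

# Decode one prediction back to characters
pred = model.predict(X_train[:1])        # shape (1, 10, 11)
indices = np.argmax(pred, axis=-1)[0]    # most likely index at each of the 10 positions
decoded = ''.join(inv_machine_vocab[int(i)] for i in indices)
print(decoded)                           # should approach YYYY-MM-DD format as training improves

If the shapes match and training still fails, posting the exact error message would help narrow down where it goes wrong.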