I prepared the data as follows:
X -> an array with shape (19909, 1, 27), each character one-hot encoded.
Y -> an array with shape (19909, 27) matching X, but shifted one position to the left.
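For readers who want to reproduce this layout, here is a minimal sketch of how such X/Y pairs could be built. The 27-symbol vocabulary (26 lowercase letters plus "\n" as a separator) and the names `chars`, `char_to_ix` and `one_hot_pairs` are assumptions for illustration; the post does not say how the original encoding was done.

```python
import numpy as np

# Assumed vocabulary: "\n" as a name separator plus 26 lowercase letters (27 symbols).
chars = "\n" + "abcdefghijklmnopqrstuvwxyz"
char_to_ix = {c: i for i, c in enumerate(chars)}

def one_hot_pairs(text):
    """Return (X, Y) where Y[k] is the one-hot encoding of the character after X[k]."""
    idx = np.array([char_to_ix[c] for c in text])
    eye = np.eye(len(chars), dtype=np.float32)
    X = eye[idx[:-1]].reshape(-1, 1, len(chars))  # (N, 1, 27): one timestep per sample
    Y = eye[idx[1:]]                              # (N, 27): the same text shifted left by one
    return X, Y

X, Y = one_hot_pairs("anna\nbob\n")
print(X.shape, Y.shape)
```

Each row of Y is simply the next character of the text, so X[k + 1, 0] and Y[k] hold the same one-hot vector.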
First, create a seed: a one-hot encoded vector representing a starting character. Then iteratively use the model to predict the next character from the current input, replacing the input with the one-hot encoding of the predicted character each time. Repeat until the name reaches the length you want.
Hope it helps! Feel free to ask if you need further help.
rand_char = random.choice(X2)
pred = model2.predict(rand_char.reshape(1, 1, 27))
name = ''
for _ in range(20):
    i = np.argmax(pred)
    name += ix_to_char[i]
    # feed the pred again to the model
    pred = model2.predict(pred.reshape(1, 1, 27))
print(name)
But I got the following output: szwazwazwazwazwazwaz
Any recommendations on how to modify the code above?
Always choosing the most probable word or character (argmax) can cause this issue: greedy decoding is deterministic, so once the output enters a cycle it keeps repeating. Avoid this method if you want to prevent repetitive characters or patterns. Also, make sure your model is implemented correctly and that the predicted probabilities are reasonable.
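One common alternative to argmax is to sample the next character from the predicted distribution, optionally with a temperature. This is a rough sketch, assuming the model output is a softmax vector that sums to 1; `sample_index`, `temperature` and `rng` are illustrative names, not part of the original code.

```python
import numpy as np

def sample_index(probs, temperature=1.0, rng=None):
    """Draw one index from a probability vector instead of taking the argmax."""
    if rng is None:
        rng = np.random.default_rng()
    p = np.asarray(probs, dtype=np.float64).ravel()
    # Rescale in log space: temperature < 1 sharpens the distribution, > 1 flattens it.
    logits = np.log(p + 1e-12) / temperature
    p = np.exp(logits - logits.max())
    p /= p.sum()
    return int(rng.choice(len(p), p=p))
```

In the loop from the question, `i = sample_index(pred)` would take the place of `i = np.argmax(pred)`. A lower temperature stays closer to the argmax behaviour, while a higher one gives more varied (and noisier) names.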
Instead of reshaping pred for the next prediction, use the predicted index i to build a new one-hot input vector for the selected character. The model was trained on one-hot inputs, not on probability distributions, so feed it this new vector as the input for the next prediction.
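Put together, the generation loop could look roughly like the sketch below. It assumes `predict_fn` behaves like `model2.predict` (takes an array of shape (1, 1, vocab_size) and returns one of shape (1, vocab_size)) and that `ix_to_char` maps indices to characters as in the question; `generate_name` and `start_ix` are hypothetical names introduced here.

```python
import numpy as np

def generate_name(predict_fn, ix_to_char, start_ix, length=20):
    """Generate `length` characters, feeding each chosen character back as a one-hot input."""
    vocab_size = len(ix_to_char)
    x = np.zeros((1, 1, vocab_size), dtype=np.float32)
    x[0, 0, start_ix] = 1.0                # seed: one-hot vector of the starting character
    name = ""
    for _ in range(length):
        pred = predict_fn(x)               # assumed shape: (1, vocab_size)
        i = int(np.argmax(pred))           # greedy choice; sampling from pred is an alternative
        name += ix_to_char[i]
        # Build a fresh one-hot input for the chosen character
        # instead of feeding the raw probabilities back in.
        x = np.zeros((1, 1, vocab_size), dtype=np.float32)
        x[0, 0, i] = 1.0
    return name
```

With the model from the question this would be called as `generate_name(model2.predict, ix_to_char, start_ix)`. Note that with argmax the output is still fully determined by the starting character, so the repetition issue mentioned above can remain unless the index is sampled rather than chosen greedily.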