Hi All,
As a fun and educational activity, I’ve been working on a spaceship name generator based on Iain M. Banks’s Culture series, and wanted to ask for some help as I’ve hit a bit of a brick wall.
Context
The Culture series of science fiction books features ultra-intelligent AI minds that inhabit spaceships and romp around having adventures in a sort of space utopia. These ships are named by the AIs themselves, and tend to have funny or witty names:
Example names from the Culture series
- Lapsed Pacifist
- A Fine Disregard for Awkward Facts
- Transient Atmospheric Phenomenon
The idea behind the model is to generate new names along the same lines, while also giving me the opportunity to explore some ML techniques!
The Model
I have a cleaned-up dataset of 196 names (specifically, short English language phrases) from the Culture series, and I’ve trained a Bidirectional LSTM model using sequences of GloVe word embeddings (glove.6B.50d.txt). I’ve included the Keras code for the core architecture at the end of this post.
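For context, the embedding matrix handed to the architecture at the end of the post is built roughly like this. This is a simplified sketch: `word_index` and `vocab_size` come from a standard Keras Tokenizer fit on the names, which isn’t shown here.

import numpy as np

# Load GloVe: word -> 50-d vector (each line is "word v1 v2 ... v50")
glove = {}
with open('glove.6B.50d.txt', encoding='utf-8') as f:
    for line in f:
        parts = line.split()
        glove[parts[0]] = np.asarray(parts[1:], dtype='float32')

# Rows of embedding_matrix line up with the tokeniser's word indices;
# words missing from GloVe keep a zero row
embedding_dim = 50
embedding_matrix = np.zeros((vocab_size, embedding_dim))
for word, i in word_index.items():  # word_index from the Keras Tokenizer
    vec = glove.get(word)
    if vec is not None:
        embedding_matrix[i] = vec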
Performance
After training for 150 epochs, accuracy is high (>90%) and the loss has dropped to around 0.2 from 12+ early in training - though I’ve struggled to develop a strong intuition for what these numbers actually mean, given the nature of the problem.
I’m generating names using a similar method to the Dinosaur name generator from the Sequence Models course, though I’ve added some randomness by randomly substituting similar words based on embedding proximity.
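The substitution step looks roughly like this (a sketch - the helper name, neighbour count, and substitution rate here are illustrative, not my exact code):

import numpy as np

def nearest_words(word, glove, k=5):
    # Illustrative helper: the k nearest neighbours of `word` by cosine similarity
    v = glove[word]
    def cos(u):
        return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-8))
    scored = ((w, cos(u)) for w, u in glove.items() if w != word)
    return [w for w, _ in sorted(scored, key=lambda t: -t[1])[:k]]

# Occasionally swap a generated word for a near neighbour to add variety;
# the 20% rate is illustrative
if word in glove and np.random.rand() < 0.2:
    word = np.random.choice(nearest_words(word, glove, k=5))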
The resulting model is clearly in the right sort of vicinity, but I think it’s still a long way from generating genuinely interesting / novel names:
- Sound Nonsense in Awkward Applications of Stick
- Pacifist Cause Fears Rapture Truth Case
Next Steps / How to Help
I’d really appreciate suggestions for what I can do next to improve performance - including a total redesign if necessary!
My current thinking is that the lack of training samples is likely one of the biggest issues - 196 short English phrases is a very small dataset.
I’m also wondering if there may be a better method for generating names. I’m currently taking a seed word and then selecting each next word by sampling from the softmax probability distribution. I’ve noticed that if I use “argmax” instead, the model is quite good at regenerating parts of the original names (e.g. “a” yields “a momentary lapse of”)…
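Concretely, the two strategies differ only in the final step. A sketch, where `padded_seed` is the padded index sequence for the current prefix, shaped (1, max_sequence_length):

import numpy as np

# probs: the model's softmax output over the vocabulary for the current prefix
probs = model.predict(padded_seed)[0]
probs = probs / probs.sum()  # guard against float drift before sampling

# Sampling draws the next word index from the full distribution...
next_index = np.random.choice(len(probs), p=probs)
# ...whereas argmax deterministically replays the most likely continuation
greedy_index = int(np.argmax(probs))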
But ultimately I’m open to ideas or other forms of help. Please feel free to drop a message here or otherwise reach out if you’d like to help out!
The following code sample shows the core network architecture:
from tensorflow.keras.layers import Input, Embedding, Bidirectional, LSTM, Dense
from tensorflow.keras.models import Model

sentence_indices = Input(shape=(max_sequence_length,), dtype='int32')  # padded word-index sequences
# Frozen embedding layer initialised with the pre-trained GloVe matrix
embedding_layer = Embedding(input_dim=vocab_size, output_dim=embedding_dim, weights=[embedding_matrix], input_length=max_sequence_length, trainable=False)
embeddings = embedding_layer(sentence_indices)
X = Bidirectional(LSTM(units=lstm_units, return_sequences=True))(embeddings)
X = Bidirectional(LSTM(units=lstm_units, return_sequences=False))(X)
X = Dense(units=128, activation='relu')(X)
X = Dense(units=vocab_size, activation='softmax')(X)  # distribution over the next word
model = Model(inputs=sentence_indices, outputs=X)
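The model is then compiled and trained in the usual way; for completeness (the optimiser and the training-array names here are illustrative):

# X_train: padded prefix sequences; y_train: one-hot next-word targets
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(X_train, y_train, epochs=150)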