@arvyzukai (plus anyone else of course)
I got the solution working, but while trying to understand the code better I added print statements, and the output is confusing.
- In the generate_one_step method I added lines like:
print('Entering generate_one_step')
print('states: {}'.format(states))
input_ids = line_to_tensor(line=inputs, vocab=self.vocab)
print('inputs: {}'.format(inputs))
…
print('Exiting generate_one_step')
- In generate_n_chars I added print lines as follows:
for n in range(num_chars):
    print('n = {}'.format(n))
    print('result = {}'.format(result))
    next_char, states = self.generate_one_step(next_char, states=states)
    result.append(next_char)
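Since result is a list of string tensors, printing it directly shows the full repr of every element, which makes the trace hard to scan. A possible way to keep the debug output short is sketched here. The helper name `compact` is hypothetical, and the sketch assumes each element is a shape-(1,) string tensor, as the trace output shows:

```python
import tensorflow as tf


def compact(result):
    """Join a list of shape-(1,) string tensors into one Python str."""
    # tf.strings.join concatenates the tensors element-wise;
    # [0] picks the single joined string, which is then decoded from bytes.
    return tf.strings.join(result)[0].numpy().decode("utf-8")


# Stand-in for the list that generate_n_chars builds up.
result = [tf.constant([" "]), tf.constant(["h"]), tf.constant(["e"])]
print("result = {!r}".format(compact(result)))  # prints: result = ' he'
```

Printing `compact(result)` inside the loop instead of `result` itself would give one short line per iteration.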
I invoke it as follows:
tf.random.set_seed(272)
gen = GenerativeModel(model, vocab, temperature=0.5)
print(gen.generate_n_chars(8, " "), '\n\n' + '_'*80)
- The execution trace from the print statements is VERY confusing:
n = 0
result = [<tf.Tensor: shape=(1,), dtype=string, numpy=array([b' '], dtype=object)>]
Entering generate_one_step
Exiting generate_one_step
n = 1
result = [<tf.Tensor: shape=(1,), dtype=string, numpy=array([b' '], dtype=object)>, <tf.Tensor: shape=(1,), dtype=string, numpy=array([b'h'], dtype=object)>]
Entering generate_one_step
Exiting generate_one_step
n = 2
result = [<tf.Tensor: shape=(1,), dtype=string, numpy=array([b' '], dtype=object)>, <tf.Tensor: shape=(1,), dtype=string, numpy=array([b'h'], dtype=object)>, <tf.Tensor: shape=(1,), dtype=string, numpy=array([b'e'], dtype=object)>]
n = 3
result = [<tf.Tensor: shape=(1,), dtype=string, numpy=array([b' '], dtype=object)>, <tf.Tensor: shape=(1,), dtype=string, numpy=array([b'h'], dtype=object)>, <tf.Tensor: shape=(1,), dtype=string, numpy=array([b'e'], dtype=object)>, <tf.Tensor: shape=(1,), dtype=string, numpy=array([b'a'], dtype=object)>]
n = 4
result = [<tf.Tensor: shape=(1,), dtype=string, numpy=array([b' '], dtype=object)>, <tf.Tensor: shape=(1,), dtype=string, numpy=array([b'h'], dtype=object)>, <tf.Tensor: shape=(1,), dtype=string, numpy=array([b'e'], dtype=object)>, <tf.Tensor: shape=(1,), dtype=string, numpy=array([b'a'], dtype=object)>, <tf.Tensor: shape=(1,), dtype=string, numpy=array([b'r'], dtype=object)>]
… and so on
For n = 0 and n = 1 I see the corresponding Entering/Exiting generate_one_step output, but from n = 2 onward those lines no longer appear, even though result keeps growing by one character per iteration, so the method is evidently still being called.
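One guess, which I have not confirmed: if generate_one_step is decorated with @tf.function (the decorator is not shown in the snippets above, so this is an assumption), then plain Python print calls would run only while TensorFlow traces the function, not on every call. A minimal sketch of that behavior, using a hypothetical stand-in function `one_step` rather than the real method:

```python
import tensorflow as tf

trace_log = []  # records each time the Python body actually runs


@tf.function
def one_step(x, states=None):
    # Plain Python side effects run only while TensorFlow traces the function.
    trace_log.append("traced")
    print("Entering one_step (Python print)")
    # tf.print is a graph op, so it runs on every call.
    tf.print("Entering one_step (tf.print)")
    if states is None:
        # Python-level branch, resolved at trace time.
        states = tf.zeros_like(x)
    new_states = states + x
    return new_states, new_states


states = None
x = tf.constant([1.0])
for n in range(5):
    print("n = {}".format(n))
    _, states = one_step(x, states=states)

print("Python body ran {} times".format(len(trace_log)))
```

In this sketch the Python print should appear only for n = 0 (traced with states=None) and n = 1 (retraced with a tensor for states), while the tf.print line appears on every iteration. That matches the pattern in my trace, but whether it is the actual cause here depends on how generate_one_step is defined.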
Any ideas as to what might be going on? I would appreciate any explanation that you can provide.
Thank you.