C3W1 GenerativeModel question

@arvyzukai (plus anyone else of course)

I got the solution but in trying to understand the code better, I added print statements and the output is confusing.

  1. In the generate_one_step method I added lines like:
    print('Entering generate_one_step')
    print('states: {}'.format(states))
    input_ids = line_to_tensor(line=inputs, vocab=self.vocab)
    print('inputs: {}'.format(inputs))

    print('Exiting generate_one_step')

  2. In generate_n_chars I added print lines as follows:
    for n in range(num_chars):
        print('n = {}'.format(n))
        print('result = {}'.format(result))
        next_char, states = self.generate_one_step(next_char, states=states)
        result.append(next_char)

I invoke it as follows:
tf.random.set_seed(272)
gen = GenerativeModel(model, vocab, temperature=0.5)
print(gen.generate_n_chars(8, " "), '\n\n' + '_'*80)

  3. The execution trace from the print statements is VERY confusing:
    n = 0
    result = [<tf.Tensor: shape=(1,), dtype=string, numpy=array([b' '], dtype=object)>]
    Entering generate_one_step
    Exiting generate_one_step
    n = 1
    result = [<tf.Tensor: shape=(1,), dtype=string, numpy=array([b' '], dtype=object)>, <tf.Tensor: shape=(1,), dtype=string, numpy=array([b'h'], dtype=object)>]
    Entering generate_one_step
    Exiting generate_one_step
    n = 2
    result = [<tf.Tensor: shape=(1,), dtype=string, numpy=array([b' '], dtype=object)>, <tf.Tensor: shape=(1,), dtype=string, numpy=array([b'h'], dtype=object)>, <tf.Tensor: shape=(1,), dtype=string, numpy=array([b'e'], dtype=object)>]
    n = 3
    result = [<tf.Tensor: shape=(1,), dtype=string, numpy=array([b' '], dtype=object)>, <tf.Tensor: shape=(1,), dtype=string, numpy=array([b'h'], dtype=object)>, <tf.Tensor: shape=(1,), dtype=string, numpy=array([b'e'], dtype=object)>, <tf.Tensor: shape=(1,), dtype=string, numpy=array([b'a'], dtype=object)>]
    n = 4
    result = [<tf.Tensor: shape=(1,), dtype=string, numpy=array([b' '], dtype=object)>, <tf.Tensor: shape=(1,), dtype=string, numpy=array([b'h'], dtype=object)>, <tf.Tensor: shape=(1,), dtype=string, numpy=array([b'e'], dtype=object)>, <tf.Tensor: shape=(1,), dtype=string, numpy=array([b'a'], dtype=object)>, <tf.Tensor: shape=(1,), dtype=string, numpy=array([b'r'], dtype=object)>]
    … and so on

For n=0 and n=1 I see the corresponding entering/exiting generate_one_step print output, but after that I do not see the enter/exit statements for higher values of n.

Any ideas as to what might be going on? I would appreciate any explanation that you can provide.

Thank you.

Hi @Cawnpore_Charlie

In short, TensorFlow's tf.function converts a Python function into a TensorFlow graph for better performance. Here are the basics of it, more on that, and even more.

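In other words, plain Python code such as print statements and string operations (like .format()) runs only while TensorFlow traces the function to build the graph. It is not part of the graph itself, so it is skipped on later calls that reuse the graph, which is why you only see the prints on the early calls. My guess is that you see them twice because the first call receives states=None and the second receives a tensor, so the function gets traced once for each kind of input.
I’m not a big fan of TensorFlow and my knowledge is limited, but if you just want to see what’s inside, modify your prints to: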

        tf.print('Entering generate_one_step')
        tf.print(states)
        input_ids = line_to_tensor(inputs, self.vocab)
        tf.print(inputs)
...
        tf.print('Exiting generate_one_step')
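
If it helps, here is a minimal sketch of the same effect outside the assignment. The `step` function and `trace_log` list are made up for illustration (they are not the course code), and I'm assuming that `generate_one_step` is wrapped in `tf.function` and first gets `states=None`, as in your loop. The Python `print` fires only when the function is traced, while `tf.print` runs on every call:

```python
import tensorflow as tf

trace_log = []  # hypothetical helper: records each time the Python body actually runs


@tf.function
def step(x, states=None):
    # Python side effects: executed only while TensorFlow traces the function.
    trace_log.append("traced")
    print("Python print: tracing step")
    # Graph op: executed on every call, traced or not.
    tf.print("tf.print: running step")
    if states is None:
        # Plain Python branch, resolved at trace time.
        states = tf.zeros_like(x)
    return x + states, states + 1.0


x = tf.constant([1.0])
states = None
for n in range(4):
    # Call 0 passes states=None, calls 1-3 pass a float32 tensor of shape (1,).
    _, states = step(x, states=states)

# Two traces are expected: one for states=None, one for a tensor argument.
print(len(trace_log))
```

The Python `print` line should show up twice (on the first two calls), while the `tf.print` line shows up on all four, which matches the pattern in your trace.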

Cheers

Thank you - will check out the references you listed - very surprising behavior!

I will try out the tf.print statements!

Thanks, again.