C3W1 GenerativeModel question

Cawnpore_Charlie · February 1, 2024, 2:28am

@arvyzukai (plus anyone else of course)

I got the solution but in trying to understand the code better, I added print statements and the output is confusing.

In the generate_one_step method I added lines like:

    print('Entering generate_one_step')
    print('states: {}'.format(states))
    input_ids = line_to_tensor(line=inputs, vocab=self.vocab)
    print('inputs: {}'.format(inputs))
    …
    print('Exiting generate_one_step')

In generate_n_chars I added print lines as follows:

    for n in range(num_chars):
        print('n = {}'.format(n))
        print('result = {}'.format(result))
        next_char, states = self.generate_one_step(next_char, states=states)
        result.append(next_char)

I invoke it as follows:

    tf.random.set_seed(272)
    gen = GenerativeModel(model, vocab, temperature=0.5)
    print(gen.generate_n_chars(8, " "), '\n\n' + '_'*80)

Execution trace via print statements is VERY confusing:
n = 0
result = [<tf.Tensor: shape=(1,), dtype=string, numpy=array([b' '], dtype=object)>]
Entering generate_one_step
Exiting generate_one_step
n = 1
result = [<tf.Tensor: shape=(1,), dtype=string, numpy=array([b' '], dtype=object)>, <tf.Tensor: shape=(1,), dtype=string, numpy=array([b'h'], dtype=object)>]
Entering generate_one_step
Exiting generate_one_step
n = 2
result = [<tf.Tensor: shape=(1,), dtype=string, numpy=array([b' '], dtype=object)>, <tf.Tensor: shape=(1,), dtype=string, numpy=array([b'h'], dtype=object)>, <tf.Tensor: shape=(1,), dtype=string, numpy=array([b'e'], dtype=object)>]
n = 3
result = [<tf.Tensor: shape=(1,), dtype=string, numpy=array([b' '], dtype=object)>, <tf.Tensor: shape=(1,), dtype=string, numpy=array([b'h'], dtype=object)>, <tf.Tensor: shape=(1,), dtype=string, numpy=array([b'e'], dtype=object)>, <tf.Tensor: shape=(1,), dtype=string, numpy=array([b'a'], dtype=object)>]
n = 4
result = [<tf.Tensor: shape=(1,), dtype=string, numpy=array([b' '], dtype=object)>, <tf.Tensor: shape=(1,), dtype=string, numpy=array([b'h'], dtype=object)>, <tf.Tensor: shape=(1,), dtype=string, numpy=array([b'e'], dtype=object)>, <tf.Tensor: shape=(1,), dtype=string, numpy=array([b'a'], dtype=object)>, <tf.Tensor: shape=(1,), dtype=string, numpy=array([b'r'], dtype=object)>]
… and so on

For n=0 and n=1 I see the corresponding entering/exiting generate_one_step output, but for higher values of n the enter/exit statements no longer appear.

Any ideas as to what might be going on? I would appreciate any explanation that you can provide.

Thank you.

arvyzukai · February 1, 2024, 6:40am

Hi @Cawnpore_Charlie

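In short, TensorFlow's tf.function makes a graph out of your program (it converts a Python function into a TensorFlow graph for better performance). Here are the basics of it, more on that and even more.

In other words, Python print statements and Python string operations (like .format()) run only while TensorFlow is tracing the function to set up the computation graph. Once the graph exists, later calls reuse it and the Python-only code is skipped. That is why you see the prints during the first call. You most likely see them again for n=1 because the first call is made with states=None and the second with a tensor, so the function gets traced a second time; from n=2 on the inputs have the same signature and the existing graph is reused.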
I’m not a big fan of TensorFlow and my knowledge is limited, but if you just want to see what’s inside, modify your prints to:

        tf.print('Entering generate_one_step')
        tf.print(states)
        input_ids = line_to_tensor(inputs, self.vocab)
        tf.print(inputs)
...
        tf.print('Exiting generate_one_step')
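
If you want to see the difference in isolation, here is a minimal sketch outside the assignment. The names `step` and `trace_log` are made up for illustration and are not part of the course code; the only assumption carried over is that the state starts as None and becomes a tensor after the first call.

```python
import tensorflow as tf

# Hypothetical helper list: records every time the Python body actually runs.
trace_log = []


@tf.function
def step(x, state=None):
    # Python side effects: executed only while tf.function traces the body.
    trace_log.append("traced")
    print("python print: tracing step")
    # tf.print becomes a graph op, so it runs on every call.
    tf.print("tf.print: running step")
    if state is None:  # resolved at trace time, since None is a Python value
        state = tf.zeros_like(x)
    new_state = state + x
    return new_state, new_state


x = tf.constant([1.0])
state = None
for n in range(4):
    print("n = {}".format(n))
    out, state = step(x, state)

# Two traces are recorded: one for state=None, one for state as a tensor.
print("number of traces:", len(trace_log))
```

The Python print shows up only for n=0 and n=1, while the tf.print line shows up on all four calls, which matches the pattern in the trace above.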

Cheers

Cawnpore_Charlie · February 1, 2024, 6:42am

Thank you - will check out the references you listed - very surprising behavior!

I will try out the tf.print statements!

Thanks, again.
