[Week 2] alpaca_model: why model2.layers[4], a magic number?

This question includes no code that students are supposed to fill in, only code from the instructor's version, shown for demonstration purposes.

At the beginning of exercise 3, the code goes:

[screenshot: the code cell from exercise 3]

Aha, a magic number: 4.
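For readers without the screenshot, the cell looks roughly like this (quoted from memory, so the exact lines may differ):

```python
base_model = model2.layers[4]   # <-- the magic number 4
base_model.trainable = True
print("Number of layers in the base model:", len(base_model.layers))
```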

I'm new to TensorFlow, and it took me quite some time to figure out what this is about. It turns out to be connected to the alpaca_model function defined above, or at least that's what I think.

So I checked the definition and I found this.

The reason I think so comes from the print output I got. To see what was going on, I inserted a print statement right after the first line; you can see it in the first picture.

The information I got is listed below.

[screenshot: the print output]

inputs corresponds to model2.layers[0].
data_augmentation links to model2.layers[1].
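For reference, this mapping can be reproduced with a simple enumeration (my own sketch, not code from the notebook):

```python
# Map each index to its layer class and name.
for i, layer in enumerate(model2.layers):
    print(i, layer.__class__.__name__, layer.name)
```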

Just when I was thinking that was it, that I had found the reason for the magic number 4, I got a hard blow.
[screenshots: the instructor's version, which uses model2.layers[4]]

Obviously, model2.layers[4] links to the pretrained model, because the instructor's version says so. Yet according to the picture posted above, the index should be 3. I really want to know what's going on here.

I think those “no attribute” errors are a signal that you need to restart the kernel and re-run all of the cells in the notebook.

I recommend you look in the output for the command “summary(model2)” earlier in the notebook.

@TMosh: You ignored the root question, which was: what's up with the magic number 4?
I think the analysis made by @yuyang was extremely insightful. Thanks for actually answering his questions, @TMosh.

Not ignored. I hinted at the answer by pointing to the summary() report.

Thanks, it's a valuable hint, but it still needs further clarification.

This is the model2.summary() information.

Precisely as analyzed.

position 0                ['InputLayer', [(None, 160, 160, 3)], 0],
position 1                ['Sequential', (None, 160, 160, 3), 0],
position 2                ['TensorFlowOpLayer', [(None, 160, 160, 3)], 0],
position 3                ['TensorFlowOpLayer', [(None, 160, 160, 3)], 0],
position 4                ['Functional', (None, 5, 5, 1280), 2257984],
position 5                ['GlobalAveragePooling2D', (None, 1280), 0],
position 6                ['Dropout', (None, 1280), 0, 0.2],
position -1                ['Dense', (None, 1), 1281, 'linear']] #linear is the default activation
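For what it's worth, a listing in roughly this format can be reproduced with something like this (my own sketch; the notebook's summary helper may be implemented differently):

```python
# Print [class name, output shape, parameter count] for each layer.
for layer in model2.layers:
    print([layer.__class__.__name__, layer.output_shape, layer.count_params()])
```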

It has the same information as what I got with the print function. I still don't understand why one line in the definition

# data preprocessing using the same weights the model was trained on
x = preprocess_input(x)

could result in two lines in the summary():

position 2                ['TensorFlowOpLayer', [(None, 160, 160, 3)], 0],
position 3                ['TensorFlowOpLayer', [(None, 160, 160, 3)], 0],

There are two layers added by the data augmentation process; those are the ones labeled
“TensorFlowOpLayer”. I think those are messing up your layer counting.

I think that is not the reason. When I print model2.layers[1], which is the data augmentation, it shows that this layer has two TensorFlow objects inside it, and that its length is 2.

If the problem occurred there, i.e. the 2 wrapped layers messed up my counting, then model2.layers[2] should also link to the data augmentation, which it does not.

If this is not convincing enough, let's look at model2.layers[4]. It prints 155 TensorFlow objects and gives a length of 155.

If the problem were the wrapped layers messing up the counting, then the 155 layers wrapped inside model2.layers[4] would shift the next index, model2.layers[5], which they do not.
[screenshot: the print output for model2.layers[4]]
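In code, the checks described above amount to something like this (a sketch of my prints, not notebook code):

```python
print(model2.layers[1])              # the data augmentation Sequential block
print(len(model2.layers[1].layers))  # 2
print(len(model2.layers[4].layers))  # 155 -- the pretrained network
```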

I think I’ll wait for a TF expert to stop by and comment on this further.

I think the confusion lies in an incorrect assumption about how indexing works. You are indexing at the “outer” level here. If one of the objects at one of the earlier positions is itself a compound object with multiple layers, that is irrelevant. From the POV of the outer level of indexing, it counts as one element. Or to put this another way, I can have a “list of lists”, right? Which list is the 3rd list in the list of lists is independent of how many elements the first list in the list of lists has. Why is that hard to understand? :laughing:
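A plain Python illustration of that point:

```python
list_of_lists = [[1, 2], [3, 4, 5, 6], [7]]
# The third list sits at index 2 no matter how many elements
# the earlier lists contain:
print(list_of_lists[2])   # [7]
```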

The information is all there in this listing in your earlier post:

position 0                ['InputLayer', [(None, 160, 160, 3)], 0],
position 1                ['Sequential', (None, 160, 160, 3), 0],
position 2                ['TensorFlowOpLayer', [(None, 160, 160, 3)], 0],
position 3                ['TensorFlowOpLayer', [(None, 160, 160, 3)], 0],
position 4                ['Functional', (None, 5, 5, 1280), 2257984],
position 5                ['GlobalAveragePooling2D', (None, 1280), 0],
position 6                ['Dropout', (None, 1280), 0, 0.2],
position -1                ['Dense', (None, 1), 1281, 'linear']] #linear is the default activation

So the 5th element (index 4) is this one:

['Functional', (None, 5, 5, 1280), 2257984],

That is a Keras Functional object which (as you mentioned) itself has 155 layers. It's the actual pretrained MobileNet model that was imported and then included at that point in the alpaca_model logic.

You can look back at the logic you wrote to construct the alpaca_model and figure out what is happening. The “Sequential” layer at index 1 in the list is the data augmentation output (a Sequential model with 2 layers). You can prove to yourself that the two TensorFlowOpLayer entries that follow are created by the preprocess_input function: comment that line out in the alpaca_model logic, comment out the invocation of the comparator so that it doesn't “throw”, and then print the layers of the resulting model. So the reason your original enumeration did not make sense is that the function at index 2 turns out to produce two layers. That function is imported from Keras, and its docpage shows that it scales the pixel values to be between -1 and 1, so it involves two elementwise steps, a division and a subtraction, each of which becomes its own TensorFlowOpLayer.
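Here's a minimal standalone sketch of that experiment (my own code, assuming the TF version used in the notebook; newer TF releases wrap these ops in layers with different names, e.g. TFOpLambda):

```python
import tensorflow as tf
from tensorflow.keras.applications.mobilenet_v2 import preprocess_input

inputs = tf.keras.Input(shape=(160, 160, 3))
x = preprocess_input(inputs)       # rescales pixel values from [0, 255] to [-1, 1]
model = tf.keras.Model(inputs, x)

# Each elementwise op shows up as its own wrapper layer in the graph:
for i, layer in enumerate(model.layers):
    print(i, layer.__class__.__name__)
```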

So, yes, 4 is a magic number. But it’s pretty clear why they set it to that value, right? They know the structure of what they created and they are selecting the included model.
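As a footnote, if you want to avoid the magic number in your own code, you can select the nested model by its class instead of its position (an illustrative alternative, not what the assignment does):

```python
# Pick the nested Functional model (the pretrained network) by class name,
# rather than hard-coding index 4.
base_model = next(
    layer for layer in model2.layers
    if type(layer).__name__ == 'Functional'
)
print(base_model.name)
```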


Very helpful - thank you!