Residual Fully Connected Block and skip connections

I was reading the paper Simple is Better: Training an End-to-end Contract Bridge Bidding Agent without Human Knowledge by Qucheng Gong, Tina Jiang and Yuandong Tian. In the paper they describe a network that consists of an initial fully connected layer, then 4 fully connected layers with skip connections added every 2 layers, to get a latent representation. … The full network architecture is shown in Figure 2. I am unclear what the skip connection does in this case. What layer does the skip connection skip? A high-level Keras description of the network would help me understand what is going on and how this helps in the classification task.

I don’t know this particular paper, but a skip connection (or residual connection) takes the output of a neuron from some earlier layer (not just the immediately preceding one) and feeds it into the current neuron in addition to the previous layer’s output. Look at these images:

And here is a coding example of a residual block in TensorFlow:


from tensorflow.keras.layers import Input, Conv2D, BatchNormalization, ReLU, Add
from tensorflow.keras.models import Model

def residual_block(x, filters, kernel_size=3, stride=1):
    # First convolutional layer
    conv1 = Conv2D(filters, kernel_size=kernel_size, strides=stride, padding='same')(x)
    bn1 = BatchNormalization()(conv1)
    relu1 = ReLU()(bn1)

    # Second convolutional layer
    conv2 = Conv2D(filters, kernel_size=kernel_size, strides=1, padding='same')(relu1)
    bn2 = BatchNormalization()(conv2)

    # Add the skip connection
    skip_connection = Add()([bn2, x])
    relu2 = ReLU()(skip_connection)

    return relu2

# Example usage
input_shape = (32, 32, 64)  # Example input shape (height, width, channels)
inputs = Input(shape=input_shape)

# Create a single residual block
outputs = residual_block(inputs, filters=64)

# Build the model
model = Model(inputs=inputs, outputs=outputs)

# Summary of the model
model.summary()



Thanks, that helps, but I still have trouble applying this to the network I am trying to understand. Here is the depiction of the Residual Fully Connected Block:

I have found the figure in the paper. The input is a 267-bit vector. For the output, one branch goes to a policy head: a fully connected layer to 38 output neurons, which masks out illegal actions provided in the input (the last 38 input bits) and then normalizes to a log policy. The other branch is a value head, which is just a fully connected layer to 1 neuron.
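Translating that description into Keras, I imagine the two heads look roughly like this sketch. The hidden size of the latent representation and the convention that the mask bits are 1 for legal actions are my assumptions, not something stated in the figure.

import tensorflow as tf
from tensorflow.keras.layers import Input, Dense, Lambda
from tensorflow.keras.models import Model

hidden_features = 200                                # placeholder size, not from the paper
inputs = Input(shape=(267,))                         # 267-bit input vector
action_mask = Lambda(lambda t: t[:, -38:])(inputs)   # last 38 bits = legal-action mask

# Stand-in for the latent representation produced by the residual FC blocks
latent = Dense(hidden_features, activation='relu')(inputs)

# Policy head: FC to 38 logits, mask illegal actions, normalize to a log policy
policy_logits = Dense(38)(latent)
masked_logits = Lambda(
    lambda args: args[0] + (1.0 - args[1]) * -1e9    # assumes mask = 1 for legal actions
)([policy_logits, action_mask])
log_policy = Lambda(lambda t: tf.nn.log_softmax(t, axis=-1))(masked_logits)

# Value head: FC to a single neuron
value = Dense(1)(latent)

model = Model(inputs=inputs, outputs=[log_policy, value])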

If I understand correctly, the circle with a + is the skip connection and the other open circle is a ReLU function, so the RFC block might be represented by the following Keras code snippet:

x = input
skip = x
x = Dense(hidden_features)(x)
x = Dense(hidden_features)(x)
x = ReLU()(x)
x = Dense(hidden_features)(x)
x = Dense(hidden_features)(x)
x = ReLU()(x)
x = Concatenate()([x, skip])

Not sure if that makes sense. Comments?

In the version of Residual Nets and skip connections that I’ve seen in DLS Course 4 Week 2, the operation you show as Concatenate is an addition.

x = Add()([x, skip])

The fact that they use the “+” symbol suggests that they have the same interpretation. That is also the way Gent showed it in his example. Of course, the other issue implicit in this is that you have to set the number of input and output neurons on each layer so that the dimensions match at the point of the addition.

You might find it useful to at least watch the lectures about Residual Networks from Prof Ng in DLS C4. You can do that for free in “audit” mode. A lot of his lectures are on YouTube as well, although I have not taken the trouble to search for his Residual Net lectures there.

OK, so to use the Add, x and skip would need the same shape. From the previous attempt, I think I had one too many Dense layers. Maybe:

x = Dense(hidden_features)(x)
skip = x
x = Dense(hidden_features)(x)
x = ReLU()(x)
x = Dense(hidden_features)(x)
x = ReLU()(x)
x = Add()([x, skip])
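Putting that together with the paper’s description (an initial fully connected layer, then 4 fully connected layers with a skip connection every 2 layers, i.e. two of these blocks), the latent trunk might look like the sketch below. hidden_features is just a placeholder size, and whether the ReLU comes before or after the Add is my guess from the figure.

from tensorflow.keras.layers import Input, Dense, ReLU, Add
from tensorflow.keras.models import Model

def residual_fc_block(x, hidden_features):
    # Two fully connected layers wrapped by one skip connection
    skip = x
    x = Dense(hidden_features)(x)
    x = ReLU()(x)
    x = Dense(hidden_features)(x)
    x = ReLU()(x)
    x = Add()([x, skip])
    return x

hidden_features = 200            # placeholder width, not taken from the paper
inputs = Input(shape=(267,))

# Initial fully connected layer (brings the 267-bit input up to the hidden width)
x = Dense(hidden_features)(inputs)

# 4 fully connected layers with a skip every 2 layers = 2 residual blocks
x = residual_fc_block(x, hidden_features)
x = residual_fc_block(x, hidden_features)

# x is the latent representation that would feed the policy and value heads
latent_model = Model(inputs=inputs, outputs=x)
latent_model.summary()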