C3W3: exercise 4 Classify

Hi @user3141

This is a good question.

You are correct: the 256 comes from the model concatenating the outputs of both branches, as the code hints for the Siamese implementation say:

# Define the Concatenate layer. You should concatenate columns, you can fix this using the `axis` parameter.
# This layer is applied over the outputs of each branch of the Siamese network

In other words, there are 10240 rows (question pairs), each holding v1 and v2, which represent the two questions. Each is 128-dimensional, and they are concatenated into a single 256-dimensional vector. Now, your question is how to separate them again (out of the v1v2 variable).

For hints, you can look at the TripletLoss function above which is already provided for you:

    v1 = out[:,:int(embedding_size/2)] # Extract v1 from out
    v2 = out[:,int(embedding_size/2):] # Extract v2 from out

The difference is that they used the embedding_size variable there (which is a bit inaccurate), while we use n_feat (which is a better choice of name). In essence they are the same: the dimension of the concatenated vectors (of the averaged and normalized LSTM outputs).
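To make the slicing concrete, here is a minimal sketch on a toy array. It assumes NumPy instead of the TensorFlow tensors used in the assignment, and the small shapes (4 rows, 3 features per branch) and the names `v1_true` and `v2_true` are made up for illustration; the real output described above would be (10240, 256).

```python
import numpy as np

# Toy stand-in for the two branch outputs (hypothetical values and sizes).
batch, half = 4, 3
v1_true = np.arange(batch * half, dtype=np.float32).reshape(batch, half)
v2_true = -v1_true

# Concatenate along the columns, so each row becomes [v1 | v2].
v1v2 = np.concatenate([v1_true, v2_true], axis=1)  # shape (4, 6)

# n_feat is the width of the concatenated vector, not of a single branch.
_, n_feat = v1v2.shape

# The first half of the columns is v1, the second half is v2.
v1 = v1v2[:, : n_feat // 2]
v2 = v1v2[:, n_feat // 2 :]

print(v1.shape, v2.shape)
```

The same column slicing should carry over to the model output, since the split point depends only on the number of columns.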

So in summary, you can do what was previously done in the TripletLoss function but with the right variable names.

Cheers
