Hi,
I am trying to solve an aspect-based sentiment analysis problem.
I have constructed an NN architecture which I believe is logical, but I am having trouble finding the right loss function. Or maybe my architecture is wrong, and you can help me with that. Here is the setup.
I have a training set of 980 rows, each containing a different (Turkish) sentence. Each row is also labeled with one or more aspects and the sentiment for each of those aspects, so one sentence can have more than one aspect.
There are 930 different aspects in the training set.
I am thinking of using an LSTM architecture by framing the problem as a multi-label, multi-class problem.
For each input sentence there will be 910 neurons at the output layer, one per aspect, and each neuron can take 4 different classes (1 if the sentiment is positive, 2 if negative, 3 if neutral, and 0 if the aspect does not exist for this sentence).
So it essentially looks like a softmax / categorical cross-entropy problem,
but the problem is that the labels are not binary; each can take 4 different values.
That is why a softmax activation function with categorical cross-entropy as the loss does not work.
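To make the label format concrete, here is a small made-up example of what I mean (the numbers are placeholders; my real label matrix has 980 rows and one column per aspect):

```python
import numpy as np

# toy illustration of the label encoding described above:
# 0 = aspect absent, 1 = positive, 2 = negative, 3 = neutral
y_train = np.array([
    [0, 1, 0, 3, 0],  # sentence 1: aspect 2 is positive, aspect 4 is neutral
    [2, 0, 0, 0, 1],  # sentence 2: aspect 1 is negative, aspect 5 is positive
])
print(y_train.shape)  # (num_sentences, num_aspects)
```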
I tried sparse categorical cross-entropy as the loss function, but that did not work either; it raises an error. Maybe a small edit would solve the problem, or maybe there is a problem with my architecture. For reference, here is my code for the model:
model = Sequential()
model.add(Embedding(len(tokenizer.word_index) + 1, 100, input_length=max_sequence_length))
model.add(Bidirectional(LSTM(128, return_sequences=True)))
model.add(Bidirectional(LSTM(128)))
model.add(Dense(512, activation='relu'))
model.add(Dense(512, activation='relu'))
model.add(Dense(930, activation='softmax'))
model.compile(loss='SparseCategoricalCrossentropy', optimizer='adam', metrics=['categorical_accuracy'])
batch_size = 64
epochs = 100
model.fit(X_train, y_train, batch_size=batch_size, epochs=epochs)
When I run the model, it gives the following error:
InvalidArgumentError: Graph execution error:
logits and labels must have the same first dimension, got logits shape [64,910] and labels shape [58240]
[[{{node sparse_categorical_crossentropy/SparseSoftmaxCrossEntropyWithLogits/SparseSoftmaxCrossEntropyWithLogits}}]] [Op:__inference_train_function_135707]
The interesting thing is that when I use 'categorical_crossentropy' it does not raise an error, but when I change it to 'sparse_categorical_crossentropy', it raises the error above.
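As far as I can tell, the shapes in the error are consistent with sparse categorical cross-entropy flattening my (batch, aspects) label matrix into one long vector of integer labels:

```python
# logits shape [64, 910]: batch of 64 sentences, one output per aspect
# labels shape [58240]: the (64, 910) label matrix, flattened
batch_size, n_aspects = 64, 910
print(batch_size * n_aspects)  # 58240, exactly the reported labels shape
```

My understanding is that sparse categorical cross-entropy expects one integer class index per sample with the class scores on the last axis, but my last axis holds 910 aspect scores, not 4 sentiment classes, so the loss cannot line the two up. I suspect 'categorical_crossentropy' does not complain only because labels and logits then happen to share the shape (batch, 910), even though it would be treating the 910 aspects as 910 mutually exclusive classes, which is not what I want.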
What do you guys think?
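One rearrangement I am considering, in case it helps the discussion: keep the labels as a (batch, n_aspects) integer matrix, but reshape the output so there is a separate 4-way softmax per aspect. A minimal sketch (the vocab size is a placeholder for my `len(tokenizer.word_index) + 1`, and the hyperparameters are just my current ones):

```python
import tensorflow as tf
from tensorflow.keras import Sequential, layers

n_aspects = 930     # aspects in my training set
n_classes = 4       # 0 = absent, 1 = positive, 2 = negative, 3 = neutral
vocab_size = 10000  # placeholder for len(tokenizer.word_index) + 1

model = Sequential([
    layers.Embedding(vocab_size, 100),
    layers.Bidirectional(layers.LSTM(128, return_sequences=True)),
    layers.Bidirectional(layers.LSTM(128)),
    layers.Dense(512, activation='relu'),
    # one 4-way softmax per aspect instead of one 930-way softmax:
    layers.Dense(n_aspects * n_classes),
    layers.Reshape((n_aspects, n_classes)),
    layers.Softmax(axis=-1),
])
# sparse CCE then gets logits (batch, n_aspects, n_classes)
# against integer labels (batch, n_aspects)
model.compile(loss='sparse_categorical_crossentropy',
              optimizer='adam',
              metrics=['sparse_categorical_accuracy'])
```

With this shape the class scores sit on the last axis, which is what sparse categorical cross-entropy looks at, so each aspect gets its own 4-class prediction. I have not verified this handles the heavy class imbalance (most aspects are 0 for most sentences), so thoughts on that are welcome too.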