Which model should I use for an LLM application?

If I want to use an LLM to evaluate whether two topics are similar, which model should I use? Would BERT be good enough, or should I use GPT or a sequence-to-sequence model?

I would suggest using BERT, since GPT is a decoder-only model geared towards generative tasks. Older RNN-based sequence-to-sequence models cannot handle long sequences as well as transformer-based models can, which can hurt performance on very long texts (topics, in your case).

BERT is an encoder-only transformer model, meaning it uses bidirectional context; this improves its performance on fixed-size predictions (more commonly called many-to-one tasks).

Your input would be in the format [CLS] n×[topic 1 token emb] [SEP] m×[topic 2 token emb] [SEP] (n and m are the lengths of the topic 1 and topic 2 sequences, respectively), and you would make your prediction from the hidden state at the [CLS] token, which is BERT's standard position for sequence-pair classification.
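Here is a minimal sketch of that setup with the Hugging Face `transformers` library. The `bert-base-uncased` checkpoint, the example topics, and the meaning of label 1 are my assumptions, not anything fixed by the task; also note the classification head is randomly initialized, so you would fine-tune on labeled topic pairs before trusting its output:

```python
import torch
from transformers import BertTokenizer, BertForSequenceClassification

# Pretrained BERT encoder plus a 2-way classification head
# (e.g. similar / not similar). The head starts untrained, so
# fine-tune on labeled topic pairs before using the predictions.
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)
model.eval()

topic_1 = "renewable energy policy"       # example topics (assumed)
topic_2 = "solar and wind power regulation"

# Passing two texts makes the tokenizer build the pair format itself:
# [CLS] topic 1 tokens [SEP] topic 2 tokens [SEP]
inputs = tokenizer(topic_1, topic_2, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# The classification head reads the hidden state at the [CLS] position.
probs = torch.softmax(logits, dim=-1)
print(f"P(similar) = {probs[0, 1]:.3f}")  # assumes label 1 = "similar"
```

The nice part of this convention is that you never build the token layout by hand: calling the tokenizer with two texts inserts [CLS] and both [SEP] tokens and sets the segment (token type) IDs that let BERT tell the two topics apart.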
