Which model should I use for an LLM application?

If I want to use an LLM to evaluate whether two topics are similar, which model should I use? Would BERT be good enough, or should I use GPT or a sequence-to-sequence model?

I would suggest using BERT, since GPT is a decoder-only model geared towards generative tasks. Older RNN-based sequence-to-sequence models cannot handle long sequences as well as transformer-based models can, which can hurt performance on very long texts (topics, in your case).

BERT is an encoder-only transformer model, meaning it uses bidirectional context; this improves its performance on fixed-size predictions (more commonly called many-to-one tasks).

Your input would be in the format [CLS] n×[topic 1 token emb] [SEP] m×[topic 2 token emb] [SEP] (n and m are the lengths of the topic 1 and topic 2 sequences, respectively), and you would make your prediction from the hidden state at the [CLS] token, which is BERT's standard position for sequence-pair classification.
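Here is a minimal sketch of that setup with the Hugging Face `transformers` library. The `bert-base-uncased` checkpoint, the example topics, and the meaning of label 1 are my assumptions, not anything fixed by the task; also note the classification head is randomly initialized, so you would fine-tune on labeled topic pairs before trusting its output:

```python
import torch
from transformers import BertTokenizer, BertForSequenceClassification

# Pretrained BERT encoder plus a 2-way classification head
# (e.g. similar / not similar). The head starts untrained, so
# fine-tune on labeled topic pairs before using the predictions.
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)
model.eval()

topic_1 = "renewable energy policy"       # example topics (assumed)
topic_2 = "solar and wind power regulation"

# Passing two texts makes the tokenizer build the pair format itself:
# [CLS] topic 1 tokens [SEP] topic 2 tokens [SEP]
inputs = tokenizer(topic_1, topic_2, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# The classification head reads the hidden state at the [CLS] position.
probs = torch.softmax(logits, dim=-1)
print(f"P(similar) = {probs[0, 1]:.3f}")  # assumes label 1 = "similar"
```

The nice part of this convention is that you never build the token layout by hand: calling the tokenizer with two texts inserts [CLS] and both [SEP] tokens and sets the segment (token type) IDs that let BERT tell the two topics apart.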
