Model Adaptation

Hello, the aim of this discussion is to share ideas. I would like to, let's say, adapt a model to different tasks (summarization, translation, …) while trying to find out how it works (the evaluation part) by digging through the model or even the tooling (e.g. which layer makes which decision that affects the model's output).
Does anyone have ideas to share about this? Which models are suitable, and how can this be done?

Hi @cirediallo ,

To start the conversation: since your objectives are summarization and translation, among others, you are potentially looking at sequential models.

Some examples of the classic sequential models are RNN, LSTM, GRU.
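To make the idea concrete, here is a minimal NumPy sketch of the core recurrence these classic models share (a vanilla RNN step; LSTM and GRU add gating on top). The weight shapes and toy sequence are illustrative, not from any specific model:

```python
import numpy as np

def rnn_step(x_t, h_prev, W_xh, W_hh, b_h):
    """One step of a vanilla RNN: h_t = tanh(W_xh x_t + W_hh h_prev + b_h)."""
    return np.tanh(W_xh @ x_t + W_hh @ h_prev + b_h)

rng = np.random.default_rng(1)
d_in, d_h = 3, 5                       # toy input and hidden sizes
W_xh = rng.normal(size=(d_h, d_in))    # input-to-hidden weights
W_hh = rng.normal(size=(d_h, d_h))     # hidden-to-hidden (recurrent) weights
b_h = np.zeros(d_h)

h = np.zeros(d_h)
for t in range(4):                     # unroll over a toy 4-step sequence
    h = rnn_step(rng.normal(size=d_in), h, W_xh, W_hh, b_h)
print(h.shape)                         # the hidden state summarizes the sequence
```

The same hidden state is reused at every step, which is what makes these models "sequential" and also what makes long-range dependencies hard for them.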

But the star models for tasks like summarization and translation right now are Transformers, introduced in the 2017 paper "Attention Is All You Need". The most famous example of this right now is ChatGPT.
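The key building block those papers introduced is scaled dot-product attention. Here is a self-contained NumPy sketch of that single operation (the dimensions and random inputs are toy values for illustration):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Core Transformer operation: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)    # similarity of each query to each key
    # numerically stable softmax over the key axis
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ V, weights        # weighted sum of values, plus the weights

# Toy example: 2 query positions attending over 3 key/value positions, d_k = 4.
rng = np.random.default_rng(0)
Q = rng.normal(size=(2, 4))
K = rng.normal(size=(3, 4))
V = rng.normal(size=(3, 4))
out, w = scaled_dot_product_attention(Q, K, V)
print(out.shape)        # one context vector per query position
print(w.sum(axis=-1))   # each row of attention weights sums to 1
```

The attention weight matrix `w` is also interesting for your interpretability goal: inspecting which positions a layer attends to is one common (if imperfect) way of probing what a Transformer layer is doing.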

Thank you for your answer. In week 4 of the NLP Specialization I saw models like BERT and T5 that can be fine-tuned for downstream tasks, but they don't do ASR.
I saw in a paper that we can add an acoustic layer to BERT to make it do ASR, but I don't know about T5. Another aspect to take into account is the hosting cost of the model (about which I don't know much).
The first task I want to do is collect information and plan a development strategy (I don't know if it's possible, but ideally with one pipeline for all downstream tasks).
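On the "one pipeline for all downstream tasks" idea: T5 was designed exactly for this. It casts every task as text-to-text, and the task is selected by a textual prefix on the input, so a single model can serve summarization, translation, and more. A hypothetical helper sketching that routing (the prefix strings for summarization and English-to-French translation match T5's published convention; the function name is made up):

```python
# Task prefixes in T5's text-to-text framing: one model, many tasks.
TASK_PREFIXES = {
    "summarization": "summarize: ",
    "translation_en_fr": "translate English to French: ",
}

def build_t5_input(task, text):
    """Prepend the T5 task prefix so one model can handle many tasks."""
    if task not in TASK_PREFIXES:
        raise ValueError(f"unsupported task: {task}")
    return TASK_PREFIXES[task] + text

print(build_t5_input("summarization", "Transformers replaced recurrence with attention."))
print(build_t5_input("translation_en_fr", "The model is small."))
```

The resulting strings would then be fed to a single fine-tuned T5 checkpoint, which keeps hosting costs down compared with deploying one model per task.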