Fine-tuning for health article Q&A

I believe it was mentioned that it’s best to fine-tune starting from a base model that’s close to the domain we’re adapting it to. If I wanted to adapt a model for health content Q&A, how should I go about it? The content is not clinical; it is consumer health content, and articles such as those from Mayo Clinic would be representative.

My thought is: I would look for pre-trained models that were trained on general text but also on health content. Then I would fine-tune one of them (using PEFT techniques) on labeled, domain-specific Q&A training examples (on the order of 10-15k examples) for my use case.
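To make the "labeled Q&A training examples" part concrete, here is a rough sketch of how such pairs could be turned into prompt/completion records with a held-out validation split. The field names, the prompt template, and the helper names (`to_record`, `split_records`, `to_jsonl`) are all assumptions for illustration, not a format required by any particular fine-tuning library, and the example texts are placeholders.

```python
import json
import random


def to_record(question, answer, context=""):
    """Turn one labeled Q&A pair into a prompt/completion record.

    The template below is a hypothetical choice; adapt it to whatever
    format the chosen base model and training script expect.
    """
    prompt = "Answer the question using the article excerpt.\n\n"
    if context:
        prompt += f"Article: {context.strip()}\n\n"
    prompt += f"Question: {question.strip()}\nAnswer:"
    return {"prompt": prompt, "completion": " " + answer.strip()}


def split_records(records, val_fraction=0.1, seed=0):
    """Shuffle deterministically and hold out a validation slice."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    n_val = max(1, int(len(shuffled) * val_fraction))
    return shuffled[n_val:], shuffled[:n_val]


def to_jsonl(records):
    """Serialize records as one JSON object per line."""
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records)


# Placeholder examples; real ones would come from the labeled dataset.
examples = [
    {"question": "Placeholder question 1?", "answer": "Placeholder answer 1.",
     "context": "Placeholder article excerpt 1."},
    {"question": "Placeholder question 2?", "answer": "Placeholder answer 2.",
     "context": "Placeholder article excerpt 2."},
    {"question": "Placeholder question 3?", "answer": "Placeholder answer 3."},
]

records = [to_record(**ex) for ex in examples]
train, val = split_records(records)
train_jsonl = to_jsonl(train)
```

Keeping a validation slice aside from the start makes it easier to tell whether the fine-tuned model is actually improving on unseen questions rather than memorizing the training pairs.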

Is that on the right course or is there a better approach?

How can I discover datasets or pretrained models that are trained on specific types of content (e.g. “mental health” or “heart health” within the “health” topic, or “contract law” within the “law” topic, etc.)?
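One possible way to look for such models and datasets programmatically is a keyword search on the Hugging Face Hub. The sketch below assumes the third-party `huggingface_hub` package (a recent version, where results expose an `id` attribute) and network access. `build_queries` and `search_hub` are hypothetical helper names, and keyword search only matches repository names and metadata, so results still need manual review.

```python
def build_queries(topic, subtopics):
    """Return search strings, narrowest first, ending with the broad topic."""
    queries = [s.strip().lower() for s in subtopics if s.strip()]
    queries.append(topic.strip().lower())
    # Drop duplicates while preserving order.
    return list(dict.fromkeys(queries))


def search_hub(query, limit=5):
    """Search the Hub for models and datasets matching a keyword query.

    Requires network access and `pip install huggingface_hub`.
    """
    from huggingface_hub import HfApi  # imported lazily; third-party package

    api = HfApi()
    models = [m.id for m in api.list_models(search=query, limit=limit)]
    datasets = [d.id for d in api.list_datasets(search=query, limit=limit)]
    return {"models": models, "datasets": datasets}


queries = build_queries("health", ["mental health", "heart health"])

# Example usage (needs network access, so it is left commented out):
# for q in queries:
#     print(q, search_hub(q))
```

The same idea applies to other domains, for instance `build_queries("law", ["contract law"])`.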

Thanks!

Searching Google with that sentence, I found this paper:

There should be other links too.

Your plan seems like the right path to me. There may be other approaches too, but it looks fine to me.
