How to create a dataset from Excel or PDF files and fine-tune an LLM for a specific task

I want to create a custom LLM using the Excel and PDF files I have. How can I create a dataset from them, and how can I build an LLM from scratch for a specific task such as data generation? I am also looking for options for fine-tuning base models.

Hi @JigarB,

This is a very broad question, and answering it fully would take a whole course. My advice is to learn either LangChain or LlamaIndex and use one of them to build an LLM application with retrieval-augmented generation (RAG) techniques (many of the short courses on deeplearning.ai cover these topics). These frameworks provide loaders for PDF and Excel files, so you can ground an existing LLM in your uploaded data instead of training one from scratch.
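To give a rough idea of what the retrieval step in RAG does, here is a minimal sketch in plain Python. It is not LangChain or LlamaIndex code: the function names (`chunk_text`, `retrieve`, `build_prompt`) are hypothetical, the sample strings are made-up placeholders standing in for text you would already have extracted from your PDF and Excel files, and simple word-count similarity is swapped in for the embedding search that real frameworks typically use.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def chunk_text(text, max_words=50):
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def cosine_similarity(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(question, chunks, top_k=2):
    """Return up to top_k chunks that share words with the question, best first."""
    q_vec = Counter(tokenize(question))
    scored = [(cosine_similarity(q_vec, Counter(tokenize(c))), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for score, c in scored[:top_k] if score > 0]


def build_prompt(question, context_chunks):
    """Assemble the grounded prompt that would be sent to an LLM."""
    context = "\n\n".join(context_chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )


# Placeholder strings (assumption): text already extracted from a PDF and an Excel sheet.
documents = [
    "The warranty period for the X100 pump is 24 months from the delivery date.",
    "Invoice 1042: customer Acme, product X100 pump, quantity 3, unit price 450.",
    "Office opening hours are Monday to Friday, 9:00 to 17:00.",
]
chunks = [chunk for doc in documents for chunk in chunk_text(doc)]

question = "How long is the warranty for the X100 pump?"
top_chunks = retrieve(question, chunks, top_k=2)
prompt = build_prompt(question, top_chunks)
print(prompt)
```

The printed prompt is what you would pass to whichever LLM you choose. The model itself is left unchanged; only the context it sees comes from your files, which is why this route is usually much cheaper than fine-tuning or training from scratch.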

I hope that helps.
Samuel
