Generating a Synthetic Dataset Using an LLM

I would like to know if anyone knows how to generate a synthetic dataset using an LLM, trained on a dataset we already have in CSV format, which is very large. I'd like to know which LLM is best for this; any guidance would be very helpful. Thanks in advance.

You could use generative AI, data augmentation, data masking, cloning, or random sampling.
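To make the random sampling and augmentation ideas concrete, here is a minimal sketch using only the Python standard library. It resamples existing rows with replacement and applies a small relative jitter to numeric fields. The function name `augment_rows`, the column names `category` and `amount`, and the 5% noise level are all made up for illustration, not taken from any real dataset.

```python
import random

def augment_rows(rows, numeric_keys, n_samples, noise=0.05, seed=0):
    """Resample rows with replacement and jitter numeric fields by up to +/- noise (relative)."""
    rng = random.Random(seed)  # seeded so the output is reproducible
    synthetic = []
    for _ in range(n_samples):
        row = dict(rng.choice(rows))  # copy, so the original row is not modified
        for key in numeric_keys:
            row[key] = round(row[key] * (1 + rng.uniform(-noise, noise)), 2)
        synthetic.append(row)
    return synthetic

# Hypothetical sample records (illustrative only).
rows = [
    {"category": "a", "amount": 100.0},
    {"category": "b", "amount": 250.0},
]
synthetic = augment_rows(rows, ["amount"], n_samples=5)
for row in synthetic:
    print(row)
```

Note that this only produces near-copies of existing rows, so it adds volume but not really new patterns.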

One of the best-known options is Hugging Face.

You can also use LangChain.

Is there any course at deeplearning.ai that focuses on synthetic data creation?

May I know whether this is in the context of general data or a specific data type (image, text, or a combination)? Can you be more specific about what you are looking for?

Basically, I am working on a trade promotions module with a partner. We have some sample data: structured data identifying different coupons and discount strategies (buy 1 get 1 free, etc.) by date for the past 10 years. It also has the price each item was sold for, the markups, seasonal holiday sales volumes, and the SKU used as the product ID.
So from a few hundred records, we want to create millions of them spanning 10 years.
We also need to be mindful not to randomize blindly, but to maintain some relationships in the randomization.
We then want to train our model on this data.

So I wanted to explore creating synthetic data based on some limited real structured data.

Thanks!
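One way to scale a few hundred records up while keeping the relationships between numeric columns is sketched below. To be clear, this is a plain statistical approach and not an LLM: it fits the mean and covariance of the real rows and samples new rows from a multivariate Gaussian via a Cholesky factor, which preserves linear correlations. The column layout `[discount_pct, price, units_sold]`, the sample values, and all function names are assumptions made up for illustration.

```python
import math
import random

def fit_gaussian(rows):
    """Return column means and the sample covariance matrix of numeric rows."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    cov = [[sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
            for j in range(d)] for i in range(d)]
    return means, cov

def cholesky(a):
    """Lower-triangular L with L * L^T == a (a must be positive definite)."""
    d = len(a)
    L = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(a[i][i] - s)
            else:
                L[i][j] = (a[i][j] - s) / L[j][j]
    return L

def sample_rows(means, cov, n_samples, seed=0):
    """Draw correlated rows: x = means + L z, with z standard normal."""
    rng = random.Random(seed)
    L = cholesky(cov)
    d = len(means)
    out = []
    for _ in range(n_samples):
        z = [rng.gauss(0.0, 1.0) for _ in range(d)]
        out.append([means[i] + sum(L[i][k] * z[k] for k in range(i + 1))
                    for i in range(d)])
    return out

def correlation(rows, i, j):
    """Pearson correlation between columns i and j."""
    _, cov = fit_gaussian(rows)
    return cov[i][j] / math.sqrt(cov[i][i] * cov[j][j])

# Hypothetical records: [discount_pct, price, units_sold] (illustrative only).
real = [
    [0.0, 10.0, 100.0],
    [10.0, 9.2, 130.0],
    [20.0, 7.9, 170.0],
    [25.0, 7.6, 160.0],
    [50.0, 5.1, 310.0],
    [5.0, 9.4, 95.0],
]
means, cov = fit_gaussian(real)
synthetic = sample_rows(means, cov, n_samples=2000)
print("discount vs price:", correlation(real, 0, 1), correlation(synthetic, 0, 1))
print("discount vs units:", correlation(real, 0, 2), correlation(synthetic, 0, 2))
```

A few caveats with this sketch: Gaussian sampling can produce out-of-range values (such as negative discounts), so results would need clipping or validation, and categorical fields like promotion type or SKU, as well as dates and seasonality, would need separate handling, for example by fitting one model per category. Dedicated tabular synthesizers handle these cases more thoroughly.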

Hi there, I am new. Do you also have trouble with the chatbot? It doesn't work in the learning platform.

Hi @rds,

Please avoid commenting on posts created by others. Always create a new topic for your query, to avoid confusion for other learners as well as yourself. Also, if your chatbot issue is related to AI Python for Beginners, report it in the relevant category when you create your new topic, so that the course mentors or staff can guide you further.

Regards
DP