Are there any manga and anime fans out here?

Hi everyone, I want to fine-tune a large language model for question answering in the otaku universe. This culture isn’t really present in Mali, and I’m pretty sure it will be impossible for me to get my QA pairs here. I’m also thinking about how to collect the data with minimal effort and cleaning, so I thought of creating a Google spreadsheet to collect a lot of QA pairs in the simplest way, but I need otakus to fill it out. Is anyone here interested in helping me collect this data?
And if someone has a better idea for data collection than a shared spreadsheet, please let me know!

Have you seen this?

That’s creative

Sure, my first idea was to search for a dataset on Kaggle, but it turns out that those are mainly datasets for recommender systems. Nothing about text or QA, nothing to train an LLM on.

Yes, but I’m struggling to collect data!

Why can’t you turn details about the dataset into a QA dataset?
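
A minimal sketch of what this could look like, assuming the dataset has metadata columns such as `title`, `author`, and `genre` (these column names, the templates, and the `records_to_qa` helper are hypothetical, not taken from any specific Kaggle dataset). Each template turns one metadata field into a question, and the field value becomes the answer:

```python
from typing import Dict, List

# Hypothetical templates: metadata field -> question pattern.
TEMPLATES = {
    "author": "Who wrote {title}?",
    "genre": "What genre is {title}?",
}


def records_to_qa(records: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Turn metadata records into QA pairs using fixed question templates."""
    qa_pairs = []
    for record in records:
        title = record.get("title")
        if not title:
            continue  # a question cannot be phrased without a title
        for field, template in TEMPLATES.items():
            value = record.get(field)
            if value:  # skip fields that are missing or empty
                qa_pairs.append(
                    {"question": template.format(title=title), "answer": value}
                )
    return qa_pairs


# Illustrative record; the "genre" field is left out on purpose.
records = [{"title": "One Piece", "author": "Eiichiro Oda"}]
pairs = records_to_qa(records)
print(pairs)
```

Pairs produced this way are only as varied as the templates, so they would probably serve best as a seed set alongside questions written by real fans.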

It would require too much time, and I wanted to get QA pairs from real otakus, because they are the people the model would be trained for.

How about the following approaches for generating Q&A pairs:

  1. Provide few-shot examples to an LLM and have it generate responses for new content.
  2. Use a crowdsourcing / freelance platform.
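
The first approach could be sketched roughly as follows. This only builds the prompt text and makes no assumption about which LLM or API receives it. The `build_few_shot_prompt` helper, the instruction wording, and the example passages are all hypothetical:

```python
from typing import List, Tuple


def build_few_shot_prompt(
    examples: List[Tuple[str, str, str]], new_passage: str
) -> str:
    """Assemble a few-shot prompt from (passage, question, answer) examples."""
    parts = ["Write one question and its answer about the passage."]
    for passage, question, answer in examples:
        parts.append(
            f"Passage: {passage}\nQuestion: {question}\nAnswer: {answer}"
        )
    # The final block leaves "Question:" open for the model to complete.
    parts.append(f"Passage: {new_passage}\nQuestion:")
    return "\n\n".join(parts)


# Illustrative hand-written examples; real ones would come from fans.
examples = [
    (
        "One Piece is a manga written by Eiichiro Oda.",
        "Who wrote One Piece?",
        "Eiichiro Oda",
    ),
    (
        "Naruto is a manga written by Masashi Kishimoto.",
        "Who wrote Naruto?",
        "Masashi Kishimoto",
    ),
]

prompt = build_few_shot_prompt(examples, "<new passage goes here>")
print(prompt)
```

The generated pairs would still need a manual review pass, since the model tends to imitate the style of the few examples it is shown.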

I already thought about the first one, but I didn’t want my data to contain such a pattern. I want data from real manga/anime fans, which is more likely to be unbiased, pertinent, and high quality.
And for your second idea, do you mean delegating the task to freelancers whom I would pay to collect the data?

Yes. Have you heard of MTurk?

No, I had never heard of MTurk before you mentioned it. Looks interesting! Thank you! :thanks:

Seems interesting, I might add something if I find anything helpful! :wink:

Hi, I am interested in helping you out.

Thank you for your interest, and sorry for the late reply. Here is the Google Form I’m using to collect data. I appreciate your help! Thanks :hugs: