I want to recast an extractive QA task as a reinforcement learning problem. The idea is to integrate the NLP task into an RL setup, compare the results against an NLP-only model, and understand why it performs better or worse.
I have a set of legal contracts and I want to use an extractive QA task to extract relevant information such as payment terms and insurance. I have tokenized the data and have a Hugging Face transformer model for this. My ultimate goal is to use a DQN where the transformer model serves as the base model.
To achieve this, I need to convert the tokenized data into an RL environment and train the combined model on a group of contract environments. However, I am having difficulty creating the environment for the DQN.
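To make the question concrete, here is a minimal sketch of the kind of environment I have in mind. Everything in it is an assumption rather than something I have working: the names `QAExample`, `SpanSelectionEnv`, and `span_f1` are hypothetical, the five move/stop actions and the F1-difference reward are only one possible design, and the observation is just raw token ids plus the current span (in the real setup the transformer would encode the tokens before a Q-value head). It follows the usual `reset`/`step` convention but does not depend on any RL library.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class QAExample:
    """One tokenized question + contract passage (hypothetical container)."""
    token_ids: List[int]
    gold_start: int  # index of the first answer token
    gold_end: int    # index of the last answer token (inclusive)


def span_f1(pred: Tuple[int, int], gold: Tuple[int, int]) -> float:
    """Token-overlap F1 between two inclusive (start, end) spans."""
    overlap = min(pred[1], gold[1]) - max(pred[0], gold[0]) + 1
    if overlap <= 0:
        return 0.0
    precision = overlap / (pred[1] - pred[0] + 1)
    recall = overlap / (gold[1] - gold[0] + 1)
    return 2 * precision * recall / (precision + recall)


class SpanSelectionEnv:
    """Assumed design: the agent nudges a candidate span, then stops.

    Actions: 0 = start left, 1 = start right,
             2 = end left,   3 = end right, 4 = stop.
    """
    N_ACTIONS = 5

    def __init__(self, example: QAExample, max_steps: int = 50):
        self.example = example
        self.max_steps = max_steps
        self.reset()

    def reset(self):
        # Every episode starts with a one-token span at position 0.
        self.start, self.end, self.steps = 0, 0, 0
        return self._obs()

    def _obs(self):
        # Observation: the token ids plus the current span boundaries.
        return (tuple(self.example.token_ids), self.start, self.end)

    def step(self, action: int):
        last = len(self.example.token_ids) - 1
        gold = (self.example.gold_start, self.example.gold_end)
        before = span_f1((self.start, self.end), gold)

        # Move one boundary, keeping 0 <= start <= end <= last.
        if action == 0:
            self.start = max(0, self.start - 1)
        elif action == 1:
            self.start = min(self.end, self.start + 1)
        elif action == 2:
            self.end = max(self.start, self.end - 1)
        elif action == 3:
            self.end = min(last, self.end + 1)

        self.steps += 1
        after = span_f1((self.start, self.end), gold)
        done = action == 4 or self.steps >= self.max_steps
        # Shaped reward: change in F1 against the gold span.
        reward = after - before
        return self._obs(), reward, done, {"f1": after}
```

With this design, one environment would be created per (question, contract passage) pair, and the DQN would sample episodes across them. Because the reward is the change in F1, the rewards in an episode that starts from zero overlap sum to the final F1 of the selected span. Whether this is a sensible formulation compared with directly predicting start and end indices is part of what I am unsure about.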
I would greatly appreciate it if you could offer any advice or guidance on how I can proceed with this task. I am eager to learn and am open to any suggestions you may have.
Thank you very much for your time and consideration.