Hello everyone. I need help with a research project. My colleague and I are trying to take both text and a related image as input (a question from a question paper), and the output has to be the level of the question: remembering, application, comparison, etc. My colleague says we should process the image and the text separately, fuse the resulting features, and then feed them to a BiGRU network. I think we might need some LLM instead, because it would make for a more general application and it might give us features directly from both the text and the image. What are your thoughts? We will be using Python.
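To make my colleague's proposal concrete, here is a minimal sketch of how I understand it, assuming PyTorch. Everything in it is an assumption for illustration: the class name `FusionBiGRU`, the layer sizes, the vocabulary size, and the idea that the image has already been turned into a 512-dimensional feature vector by some separate image encoder (random tensors stand in for real data here). The `LEVELS` list only contains the three example levels from above, not our full label set.

```python
import torch
import torch.nn as nn

# Illustrative labels only (the three examples mentioned in the post).
LEVELS = ["remembering", "application", "comparison"]


class FusionBiGRU(nn.Module):
    """Hypothetical sketch: fuse text and image features, then run a BiGRU."""

    def __init__(self, vocab_size=1000, embed_dim=64, image_dim=512,
                 hidden_dim=128, num_levels=len(LEVELS)):
        super().__init__()
        # Text branch: token ids -> embeddings.
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        # Image branch: project precomputed image features to embed_dim.
        self.image_proj = nn.Linear(image_dim, embed_dim)
        # The BiGRU reads the fused sequence in both directions.
        self.gru = nn.GRU(2 * embed_dim, hidden_dim,
                          batch_first=True, bidirectional=True)
        self.classifier = nn.Linear(2 * hidden_dim, num_levels)

    def forward(self, token_ids, image_features):
        tokens = self.embed(token_ids)                    # (B, T, E)
        image = self.image_proj(image_features)           # (B, E)
        # Repeat the image vector at every time step so it can be
        # concatenated with each token embedding (simple fusion).
        image = image.unsqueeze(1).expand(-1, tokens.size(1), -1)
        fused = torch.cat([tokens, image], dim=-1)        # (B, T, 2E)
        _, h_n = self.gru(fused)                          # h_n: (2, B, H)
        # Concatenate the final forward and backward hidden states.
        summary = torch.cat([h_n[0], h_n[1]], dim=-1)     # (B, 2H)
        return self.classifier(summary)                   # (B, num_levels)


# Random stand-in data: 4 questions, 12 tokens each, one image vector each.
model = FusionBiGRU()
token_ids = torch.randint(1, 1000, (4, 12))
image_features = torch.randn(4, 512)
logits = model(token_ids, image_features)
predicted = [LEVELS[i] for i in logits.argmax(dim=-1).tolist()]
print(logits.shape, predicted)
```

The model is untrained, so the predicted levels are meaningless; the point is only the data flow. Concatenating the image vector at every time step is just one possible fusion choice, and the sketch ignores padding for variable-length questions.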