I’m exploring how to build a Neo4j graph database from unstructured data, using GPT-4 to extract entities, relationships, and properties. However, the extracted relationships are often inaccurate, and the generated Cypher queries are sometimes structurally wrong. Even with few-shot prompts, the results are not always what I expect. I’m considering a move to GPT-3.5 fine-tuned on Cypher structures, but I’m concerned about losing detail or introducing more noise into the data.
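To make the setup concrete, here is a minimal sketch of the kind of pipeline described above. It is illustrative only and not my actual code. It assumes the model is asked to return JSON triples in a hypothetical `head` / `type` / `tail` shape, and that the Cypher is built from a fixed template instead of being written by the model. The labels, relationship types, and function names are all made up for the example.

```python
import json

# Hypothetical schema: the only labels and relationship types the graph accepts.
ALLOWED_LABELS = {"Person", "Company"}
ALLOWED_REL_TYPES = {"WORKS_AT", "FOUNDED"}


def parse_triples(llm_output: str) -> list[dict]:
    """Parse the model's JSON output and keep only triples that fit the schema."""
    valid = []
    for triple in json.loads(llm_output):
        try:
            head, rel, tail = triple["head"], triple["type"], triple["tail"]
            ok = (
                head["label"] in ALLOWED_LABELS
                and tail["label"] in ALLOWED_LABELS
                and rel in ALLOWED_REL_TYPES
                and bool(head["name"])
                and bool(tail["name"])
            )
        except (KeyError, TypeError):
            # Malformed triple (missing key or wrong shape): drop it.
            continue
        if ok:
            valid.append(triple)
    return valid


def to_cypher(triple: dict) -> tuple[str, dict]:
    """Build a MERGE statement from a validated triple.

    Labels and the relationship type come from the allow-list above;
    node names are passed as query parameters, not interpolated.
    """
    query = (
        f"MERGE (h:{triple['head']['label']} {{name: $head_name}}) "
        f"MERGE (t:{triple['tail']['label']} {{name: $tail_name}}) "
        f"MERGE (h)-[:{triple['type']}]->(t)"
    )
    params = {
        "head_name": triple["head"]["name"],
        "tail_name": triple["tail"]["name"],
    }
    return query, params


# Made-up model output: the second triple uses a relationship type outside the schema.
sample_output = json.dumps([
    {"head": {"label": "Person", "name": "Ada"}, "type": "WORKS_AT",
     "tail": {"label": "Company", "name": "Acme"}},
    {"head": {"label": "Person", "name": "Bob"}, "type": "LIKES",
     "tail": {"label": "Company", "name": "Acme"}},
])

statements = [to_cypher(t) for t in parse_triples(sample_output)]
```

Each `(query, params)` pair could then be passed to a Neo4j driver session. With this split the model only has to get the triples right, since the Cypher structure is fixed by the template. It does nothing for the accuracy of the relationships themselves, beyond discarding ones that fall outside the schema.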
I’ve looked into services like Diffbot, but the cost seems a bit steep for my current budget. I’m reaching out to see if anyone has faced similar challenges and found effective solutions or workarounds. Specifically, I’m curious about:
- Techniques or best practices you’ve used to improve the accuracy of entity and relationship extraction with LLMs.
- Any success with fine-tuning strategies that preserve detail while ensuring the structure of Cypher queries remains intact.
- Alternative tools or services that might offer a balance between cost and effectiveness for extracting structured information from unstructured data.
Additionally, if there are any cost-effective methods or lesser-known tools that could serve as an alternative to high-cost services, I’d love to hear about those too.
Thanks in advance for your insights and advice. I’m looking forward to learning from your experiences and exploring new strategies to make this project a success!