Would it be possible to share the code used for data cleaning of the original XML files?

Yes, I have the same request. It would be great.

I too have the same request: please make the SEC data cleaning scripts available.


It isn’t pretty, but the data prep project that we used can be found by googling for “github neo4j-product-examples data-prep-sec-edgar”. Apologies for being indirect, I have not yet earned enough karma here to post links.




I too have the same request. I would love to get the code they used for cleaning the XML and generating the JSON files. Did you have any luck with it?

@PrasannaPrakash, @A_Stein, @Kushwanth, @Subham1906, here’s the GitHub link that @akollegger so kindly mentioned in his post above. You’ll find the readme and all necessary info:

Thanks very much, @akollegger, for making this available. You deserve a better karma score :wink:

