Persistence of Embeddings

In the last notebook, Lesson 4: Building a Multi-Document Agent,
the code downloads several papers and parses them into a dictionary of tools.

That makes sense the first time through. But if I later want to chat with those 11 papers again, I don't want to download and parse them all over again.

What is the best approach to persist this information so that I can load all the parsed papers later?

paper_to_tools_dict = {}
for paper in papers:
    print(f"Getting tools for paper: {paper}")
    vector_tool, summary_tool = get_doc_tools(paper, Path(paper).stem)
    paper_to_tools_dict[paper] = [vector_tool, summary_tool]

Cell [3] of the notebook calls get_doc_tools for each paper, which returns the vector_tool and summary_tool for that document; both are then stored in paper_to_tools_dict[paper].

Should I find a way to save this dictionary somewhere, for example as a binary file (using pickle or something similar)?
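
To make the question concrete, this is roughly the pattern I have in mind. It is only a sketch: load_or_build, cache_dir and fake_parse are names I made up for illustration, and it assumes the cached object can be pickled at all. I am not sure that holds for the real vector_tool and summary_tool objects, since they may hold references to functions or to the LLM.

```python
import pickle
import tempfile
from pathlib import Path
from typing import Any, Callable


def load_or_build(paper: str, cache_dir: Path, build: Callable[[str], Any]) -> Any:
    """Return the cached result for `paper`; build and cache it on first use.

    `build` is a stand-in for the expensive step (download + parse).
    Assumption: whatever `build` returns is picklable.
    """
    cache_file = cache_dir / f"{Path(paper).stem}.pkl"
    if cache_file.exists():
        # Cache hit: skip the expensive step entirely.
        with cache_file.open("rb") as f:
            return pickle.load(f)

    # Cache miss: do the work once, then save it for next time.
    result = build(paper)
    cache_dir.mkdir(parents=True, exist_ok=True)
    with cache_file.open("wb") as f:
        pickle.dump(result, f)
    return result


# Hypothetical stand-in for the real parsing step, just to show the flow.
def fake_parse(paper: str) -> dict:
    print(f"Parsing {paper} (expensive)")
    return {"paper": paper, "chunks": ["chunk 1", "chunk 2"]}


with tempfile.TemporaryDirectory() as tmp:
    first = load_or_build("papers/metagpt.pdf", Path(tmp), fake_parse)   # parses
    second = load_or_build("papers/metagpt.pdf", Path(tmp), fake_parse)  # loads from disk
```

If the tool objects turn out not to be picklable, I guess the same "check the cache first, build only on a miss" structure would still apply, just with the underlying index or parsed nodes saved instead of the tools themselves.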

Any input on this is highly appreciated.

J