LLM Security/Risk Concerns


Thank you for this great community. LLM deployments raise security and risk concerns because the information provided by the user leaves the company's boundaries. This can hinder or even stop the deployment of LLM systems. Is there a solution to this? Is there a localized version that can reside solely within the company's boundaries and still perform about as well as ChatGPT?

Thank you.

Hi @mansourshams, your concerns are valid: once the data leaves your premises, you lose control over it.

There are alternatives.

There are multiple models (Bloom, Falcon, Llama, to name a few) that can be hosted on your own premises. These are smaller models or base models, so don’t expect the same out-of-the-box performance that you get from ChatGPT or the GPT family of products.

These models need to be fine-tuned for the tasks you need them for. As you may have seen or will see in the new LLM course, this can take different levels of resources and expertise.
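As a rough way to gauge the hardware side of those resources, here is a minimal sketch that estimates the memory needed just to hold a model's weights. The function name and the parameter counts are illustrative assumptions, not figures from this thread, and the estimate leaves out activations, optimizer state, and other overhead, so fine-tuning will need noticeably more.

```python
def weight_memory_gib(num_parameters: float, bytes_per_parameter: int = 2) -> float:
    """Approximate memory needed to hold the weights alone, in GiB.

    bytes_per_parameter: 4 for 32-bit floats, 2 for 16-bit, 1 for 8-bit.
    This is a lower bound: it ignores activations, optimizer state,
    and framework overhead.
    """
    return num_parameters * bytes_per_parameter / 1024**3


# Illustrative parameter counts (assumptions chosen for the example).
for label, params in [("7B-parameter model", 7e9), ("40B-parameter model", 40e9)]:
    gib = weight_memory_gib(params)
    print(f"{label}: roughly {gib:.0f} GiB of weights in 16-bit precision")
```

Comparing that number with the memory of the GPUs you have available gives a first idea of which model sizes are realistic to host in-house.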

Can you get a model to perform as well as ChatGPT? In my opinion, that is very hard. The good news is that OpenAI has announced a “personal assistant” coming up in the next few months and, as I remember reading, this product may tackle your privacy concerns. So let’s wait and see.

Hello Juan, thank you for your reply. I have a large number of technical documents to train the model with. However, I would also like the model to have a good amount of literary knowledge (say, at the level of a college graduate) so that it can read between the lines. Any suggestions for that? I have seen some TensorFlow implementations that download large set-up files (up to 42 GB), and I guess they are trying to mimic ChatGPT. Also, which of these localized models have a Python interface?

Thank you

Have you tried looking for trainable models on Hugging Face? I think that could be the best source of information for your project.

On Hugging Face you’ll find a good number of models that can be downloaded, along with the libraries to interact with them in Python.
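To give an idea of what that Python interface looks like, here is a minimal sketch using the `transformers` library (with PyTorch installed). The model name `google/flan-t5-small` is only an example chosen because it is small, and `build_prompt` and `generate_locally` are hypothetical helper names, not part of the library. Treat it as a starting point under those assumptions rather than a recommended setup.

```python
def build_prompt(question: str, context: str = "") -> str:
    """Hypothetical helper: wrap a question (and optional context) in a plain instruction."""
    question = question.strip()
    context = context.strip()
    if context:
        return (
            "Answer the question using the context.\n"
            f"Context: {context}\n"
            f"Question: {question}"
        )
    return f"Answer the question.\nQuestion: {question}"


def generate_locally(
    prompt: str,
    model_name: str = "google/flan-t5-small",  # example model, an assumption
    max_new_tokens: int = 64,
    local_files_only: bool = False,
) -> str:
    """Run a sequence-to-sequence model on this machine and return its text output.

    The weights are downloaded once and cached; inference itself runs locally.
    With local_files_only=True, only the local cache is used and no download
    is attempted.
    """
    # Imported here so the helper above works even without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name, local_files_only=local_files_only)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name, local_files_only=local_files_only)

    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate_locally(build_prompt("What is a heat exchanger?"))` downloads the weights the first time. After that, passing `local_files_only=True` keeps the whole round trip on your own machine, which is the property you are after.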

My experience, besides GPT and Anthropic’s models, is limited to Bloom and Falcon, and now to the model introduced in this course, FLAN. There are many more, though. Falcon in particular has received a lot of attention, and MosaicML offers another set of models.