How to run the course code locally

Hi, I don't want to use together.ai. I'm a teacher, and at the institute where I work I can't ask my students to register, since they are minors and our country's laws don't allow it.

So, is there any easy way to follow the course without together.ai? The helper function the course uses is defined as:

def llama(prompt,
          add_inst=True,      # wrap the prompt in [INST]...[/INST] tags
          model="togethercomputer/llama-2-7b-chat",
          temperature=0.0,
          max_tokens=1024,
          verbose=False,
          url=url,            # together.ai endpoint (module-level in utils.py)
          headers=headers,    # request headers carrying the API key
          base=2,             # number of seconds to wait between retries
          max_tries=3):

I guess my question is: is there a way to do the same with a local Llama, for example through the openai Python library?


from openai import OpenAI
prompt = "What is the capital of France?"

client = OpenAI(
    base_url='http://localhost:11434/v1/',

    # required but ignored
    api_key='ollama',
)
# PERHAPS SOME CODE GOES HERE?

chat_completion = client.chat.completions.create(
    messages=[
        {
            'role': 'user',
            'content': prompt,
        }
    ],
    model='llama2',

    # WHAT CODE GOES HERE, PERHAPS SOME PARAMS???
)

print(chat_completion.choices[0].message.content)

Sorry for my bad English, but I need help “translating” the llama function defined above (the full code is in utils.py) into some kind of code I could run locally. Using the OpenAI compatibility layer would be great; if not, the “normal” ollama Python library would also work.

Thanks in advance

Update: I wrote some code that seems to work; any ideas or improvements would be appreciated.

I think this should be executed once:

from openai import OpenAI

client = OpenAI(
    base_url='http://localhost:11434/v1/',

    # required but ignored
    api_key='ollama',
)

And the definition of my helper function:
Note: llama2 is the name of the model you are running locally; in my case:
$ ollama run llama2

def llama_openai(prompt,
                 add_inst=True,
                 model="llama2",
                 temperature=0.0,  # the OpenAI default is 1.0
                 max_tokens=1024,
                 verbose=False):

    # Wrap the prompt in Llama 2 instruction tags, like the course helper does.
    if add_inst:
        prompt = f"[INST]{prompt}[/INST]"

    if verbose:
        print(f"Prompt:\n{prompt}\n")
        print(f"model: {model}")

    response = client.chat.completions.create(
        messages=[
            {
                'role': 'user',
                'content': prompt,
            }
        ],
        model=model,
        max_tokens=max_tokens,
        temperature=temperature,
    )

    # print(f"response object: {response}")

    return response
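
One thing this version drops from the course helper is the retry logic (the base and max_tries parameters). A minimal sketch of how it could be added back, assuming we simply retry on any exception with exponential backoff (the wrapper name llama_openai_retry is my own):

import time

def llama_openai_retry(prompt, base=2, max_tries=3, **kwargs):
    # Wait base, base**2, ... seconds between failed attempts.
    for attempt in range(1, max_tries + 1):
        try:
            return llama_openai(prompt, **kwargs)
        except Exception as e:
            if attempt == max_tries:
                raise
            wait = base ** attempt
            print(f"Attempt {attempt} failed ({e}); retrying in {wait}s...")
            time.sleep(wait)

The tests below still call llama_openai() directly, without retries.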

prompt = "Help me write a birthday card for my dear friend Andrew."
response = llama_openai(prompt, verbose=True)
print(response)

Response:

Prompt:
[INST]Help me write a birthday card for my dear friend Andrew.[/INST]

model: llama2
ChatCompletion(id='chatcmpl-481', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content="Of course, I'd be happy to help you write a birthday card for your friend Andrew! Here are some suggestions:\n\nDear Andrew,\n\nHappy birthday to an amazing friend like you! 🎉 Today is all about celebrating the incredible person you are and the many adventures we've shared together. From late-night conversations to spontaneous road trips, every moment with you has been a gift.\n\nI hope your birthday is filled with laughter, love, and all your favorite things. You deserve to be pampered and spoiled rotten on this special day. May it be a reminder of how much you're appreciated and loved by those around you.\n\nHere's to another year of making memories and creating new ones together! Cheers, my dear friend! 🥂\n\nWishing you a birthday as bright and beautiful as you are,\n[Your Name]", role='assistant', function_call=None, tool_calls=None))], created=1710276555, model='llama2', object='chat.completion', system_fingerprint='fp_ollama', usage=CompletionUsage(completion_tokens=208, prompt_tokens=39, total_tokens=247))

Another test:

### base model
prompt = "What is the capital of France?"
response = llama_openai(prompt, 
                 verbose=True,
                 add_inst=False,
                 model="llama2")
print(response.choices[0].message.content)

Response:

Prompt:
What is the capital of France?

model: llama2

The capital of France is Paris.
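
And in case the OpenAI compatibility layer is not an option, a rough equivalent with the ollama Python library (an untested sketch, assuming the package is installed with pip install ollama) could look like:

import ollama

prompt = "What is the capital of France?"

# ollama.chat() applies the model's chat template server-side,
# so no manual [INST] wrapping should be needed here.
response = ollama.chat(
    model='llama2',
    messages=[{'role': 'user', 'content': prompt}],
    options={
        'temperature': 0.0,
        'num_predict': 1024,  # Ollama's rough equivalent of max_tokens
    },
)

print(response['message']['content'])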

I have a similar question about how to modify utils.py in the Llama course to run a local llama3 and still get the tutorials to work. I can get the llama3 model to run fine, but I'm not sure what to change in the utils.py file, though I can get API calls outside of the course to work. It's a shame they simply didn't include that code as an example.

Hello @vinodralh

Were you able to get the utils.py file? You can find it by clicking File -> Open.

The API key part is what changes: if you have your own API keys (for example from OpenAI), you can edit utils.py accordingly and then run the code successfully in your local environment. But remember that running all the code may require all the necessary packages to be present on your local system, and you should also make sure the packages you download are compatible with your Python version.
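
For example, a rough sketch of a drop-in replacement for the course's llama() helper, following the llama_openai() approach above, assuming a local Ollama server and that llama3 is the model you pulled (untested; note the notebooks expect llama() to return the generated text, hence the .content at the end):

from openai import OpenAI

# Point the client at the local Ollama server instead of together.ai.
client = OpenAI(
    base_url='http://localhost:11434/v1/',
    api_key='ollama',  # required but ignored
)

def llama(prompt,
          add_inst=True,
          model="llama3",  # whatever you pulled with `ollama run ...`
          temperature=0.0,
          max_tokens=1024,
          verbose=False):
    # The [INST] tags are Llama 2 specific; the chat endpoint usually
    # applies the model's own template, so add_inst may not be needed.
    if add_inst:
        prompt = f"[INST]{prompt}[/INST]"
    if verbose:
        print(f"Prompt:\n{prompt}\n")
        print(f"model: {model}")
    response = client.chat.completions.create(
        messages=[{'role': 'user', 'content': prompt}],
        model=model,
        max_tokens=max_tokens,
        temperature=temperature,
    )
    return response.choices[0].message.content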

Regards
DP