I took the Deep Learning Specialization in the Fall of 2017. At the time, my interest was very much in NLP, and I wanted to learn about the SOTA sequence model approaches. Turns out, the Sequence Models course hadn’t actually been released yet. Instead I ended up spending way too much time on a deep dive into CNNs, particularly in the context of YOLO. I dragged my feet digging back into NLP, especially once the newest innovations started being released in PyTorch. I’m 67 and can’t even list all of the programming languages and platforms I have learned over the years; I didn’t want to learn another.
That changed this week. I got a new, entry-level MacBook Air and tinkered around getting Python, TensorFlow, and Keras installed. I tried to run some of my old NLP materials but couldn’t get an environment stood up that could access TF datasets. So I broke down and tried PyTorch. Now I’m glad I did.
First, I found a really nice notebook on GPT on GitHub.
Maybe you know already, but the latest OpenAI GPT code isn’t open source, and to the best of my knowledge the newest models don’t even have a fully published architecture. GPT-2 is different: its architecture is well documented and pre-trained weights are available. Using the pattern from the notebook, I built my own GPT class, which implements the GPT-2 architecture.
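For anyone curious what that class looks like, here is a minimal sketch in PyTorch. It follows the common minGPT/nanoGPT pattern rather than reproducing the notebook’s code exactly, and the names (GPTConfig, CausalSelfAttention, Block, GPT) are my own placeholders:

```python
# Sketch of a GPT-2-style decoder-only transformer (minGPT/nanoGPT pattern).
from dataclasses import dataclass

import torch
import torch.nn as nn
import torch.nn.functional as F

@dataclass
class GPTConfig:
    vocab_size: int = 50257   # GPT-2 BPE vocabulary
    block_size: int = 1024    # maximum context length
    n_layer: int = 12         # GPT-2 "small": 12 layers, 12 heads, 768-dim embeddings
    n_head: int = 12
    n_embd: int = 768

class CausalSelfAttention(nn.Module):
    def __init__(self, cfg):
        super().__init__()
        self.n_head = cfg.n_head
        self.c_attn = nn.Linear(cfg.n_embd, 3 * cfg.n_embd)   # fused q, k, v projection
        self.c_proj = nn.Linear(cfg.n_embd, cfg.n_embd)

    def forward(self, x):
        B, T, C = x.shape
        q, k, v = self.c_attn(x).split(C, dim=2)
        q = q.view(B, T, self.n_head, C // self.n_head).transpose(1, 2)
        k = k.view(B, T, self.n_head, C // self.n_head).transpose(1, 2)
        v = v.view(B, T, self.n_head, C // self.n_head).transpose(1, 2)
        # causal attention: each position attends only to earlier positions
        y = F.scaled_dot_product_attention(q, k, v, is_causal=True)
        return self.c_proj(y.transpose(1, 2).contiguous().view(B, T, C))

class MLP(nn.Module):
    def __init__(self, cfg):
        super().__init__()
        self.c_fc = nn.Linear(cfg.n_embd, 4 * cfg.n_embd)
        self.c_proj = nn.Linear(4 * cfg.n_embd, cfg.n_embd)

    def forward(self, x):
        return self.c_proj(F.gelu(self.c_fc(x)))

class Block(nn.Module):
    """Pre-norm transformer block: attention and MLP, each with a residual."""
    def __init__(self, cfg):
        super().__init__()
        self.ln_1 = nn.LayerNorm(cfg.n_embd)
        self.attn = CausalSelfAttention(cfg)
        self.ln_2 = nn.LayerNorm(cfg.n_embd)
        self.mlp = MLP(cfg)

    def forward(self, x):
        x = x + self.attn(self.ln_1(x))
        x = x + self.mlp(self.ln_2(x))
        return x

class GPT(nn.Module):
    def __init__(self, cfg):
        super().__init__()
        self.wte = nn.Embedding(cfg.vocab_size, cfg.n_embd)   # token embeddings
        self.wpe = nn.Embedding(cfg.block_size, cfg.n_embd)   # position embeddings
        self.h = nn.ModuleList([Block(cfg) for _ in range(cfg.n_layer)])
        self.ln_f = nn.LayerNorm(cfg.n_embd)
        self.lm_head = nn.Linear(cfg.n_embd, cfg.vocab_size, bias=False)

    def forward(self, idx):
        pos = torch.arange(idx.size(1), device=idx.device)
        x = self.wte(idx) + self.wpe(pos)
        for block in self.h:
            x = block(x)
        return self.lm_head(self.ln_f(x))   # logits over the vocabulary
```

The real GPT-2 also ties the token embedding and output head weights; I left that out to keep the sketch short.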
Once the layers were defined and an instance created, I downloaded a pre-trained GPT-2 model from HuggingFace and copied over the weights.
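The copy itself is the fiddly part, because the original OpenAI checkpoint stores several layers as Conv1D, so their weight matrices are transposed relative to nn.Linear. Here is a rough sketch, assuming the local parameter names mirror the HuggingFace checkpoint (minus its transformer. prefix); treat it as an illustration of the idea rather than the notebook’s exact code:

```python
# Copy pre-trained GPT-2 weights from HuggingFace into the local GPT class.
import torch
from transformers import GPT2LMHeadModel

hf_sd = GPT2LMHeadModel.from_pretrained("gpt2").state_dict()   # 124M "small" model

model = GPT(GPTConfig())          # the local class sketched above
sd = model.state_dict()

# these layers are stored transposed (as Conv1D) in the original checkpoint
transposed = ("attn.c_attn.weight", "attn.c_proj.weight",
              "mlp.c_fc.weight", "mlp.c_proj.weight")

with torch.no_grad():
    for k, v in hf_sd.items():
        local_k = k.removeprefix("transformer.")
        if local_k not in sd:
            continue                       # skip buffers such as attention bias masks
        if local_k.endswith(transposed):
            sd[local_k].copy_(v.t())       # transpose Conv1D weights into Linear layout
        else:
            sd[local_k].copy_(v)

model.load_state_dict(sd)
```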
I don’t have access to OpenAI’s GPT class code, but I do have access to the locally defined class code. So I went in and made a few hacks, basically instrumenting the inference loop to dump some extra output once model.generate() is invoked. Here’s the call to generate():
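In spirit it boils down to something like this; the generate() signature here is my assumption (a minGPT-style method), and the encoding uses the standard HuggingFace GPT-2 tokenizer:

```python
# Encode the prompt, run generation, and decode the result.
import torch
from transformers import GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")

prompt = "What is Machine Learning?"
idx = torch.tensor([tokenizer.encode(prompt)])    # shape (1, T) of token ids
print(prompt)
print(idx.tolist())                               # the encoded version of the prompt

model.eval()
with torch.no_grad():
    out = model.generate(idx, max_new_tokens=40)  # assumed minGPT-style signature

print(tokenizer.decode(out[0].tolist()))
```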
The text prompt was “What is Machine Learning?” You can see the plain text and encoded versions of it. Below is the output from the iterations as the model processes the prompt and generates the response. I added a block to print out the top 5 token candidates and their probabilities; only the top candidate is included in the response. NOTE: for people who are asking whether or not OpenAI GPT is sentient, here is the answer. NO.
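The block I added sits inside the generation loop. Sketched as a standalone function, and assuming a minGPT-style loop rather than the notebook’s exact code, the instrumented version looks roughly like this:

```python
# Greedy decoding loop, instrumented to print the top 5 candidates per step.
import torch
import torch.nn.functional as F

@torch.no_grad()
def generate(model, tokenizer, idx, max_new_tokens=40):
    for step in range(max_new_tokens):
        logits = model(idx)                          # (1, T, vocab_size)
        probs = F.softmax(logits[:, -1, :], dim=-1)  # distribution over the next token
        top_p, top_i = torch.topk(probs, k=5)        # top 5 candidates at this step
        print(f"step {step}:")
        for p, i in zip(top_p[0].tolist(), top_i[0].tolist()):
            print(f"  {tokenizer.decode([i])!r}  p={p:.4f}")
        next_id = top_i[:, :1]                       # keep only the top-ranked candidate
        idx = torch.cat([idx, next_id], dim=1)       # append it and go again
    return idx
```

Every step is just a softmax over the vocabulary followed by a top-k ranking, and the response is built by repeatedly appending the highest-probability token; there is no deliberation hiding anywhere in there.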
Finally, the last iteration and the generated response.
So now I have a pretty decent understanding of what GPT is doing under the covers, because I can see the iterations and where the candidate ranking and selection are occurring. Since I don’t have the GPT-2 training corpus, I can’t train what I have from scratch, and honestly I couldn’t do it on my toy computer anyway. But I might be able to do a little fine-tuning, which is the next thing on my list to try.
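For reference, the kind of fine-tuning I have in mind is nothing fancy: plain next-token prediction with cross-entropy on a small dataset. This is an untested sketch, and `batches` is a hypothetical stand-in for whatever data I end up using:

```python
# Minimal fine-tuning loop: shift the tokens by one and minimize cross-entropy.
import torch
import torch.nn.functional as F

optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5)
model.train()

for input_ids in batches:                         # hypothetical iterable of (1, T) token-id tensors
    logits = model(input_ids[:, :-1])             # predict token t+1 from tokens up to t
    loss = F.cross_entropy(
        logits.reshape(-1, logits.size(-1)),      # (B*T, vocab_size)
        input_ids[:, 1:].reshape(-1),             # (B*T,) target token ids
    )
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```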
Feel free to ask questions about what I did (I added a few things and also had to edit some of the GitHub code to get it to run in my environment), and check out the GitHub repository shown at the top of the notebook… lots of interesting topics there.
Cheers