Run Smarter, Keep Your AI Local

Running an AI locally isn't just a fun experiment — it's a genuinely useful skill for any tech-curious professional. Your data stays on your machine, the model runs on your hardware, and there's no subscription fee ticking in the background.  But what does it actually look like in practice?

There are several methods out there to choose from.

  1. Add AI capability to your software, which talks to a remote LLM and gets results back. You can also do this with Copilot in Excel or Word, assuming you have an AI license and a Microsoft 365 subscription.
  2. Run an LLM on your PC using your GPU.
  3. Run an AI agent on your PC, like OpenClaw, that communicates over messaging platforms with remote LLMs.

This guide walks you through the steps for option #2, but rather than just running an AI, it puts the model through its paces with a real task: reading every Charles Dickens novel ever written and answering questions about them.

I started by obtaining a text file of all of Dickens’ works; a quick search found that the Internet Archive had a page with them in multiple formats. I downloaded the text version, which included roughly 2,000 lines of HTML at the start and some more at the end. I removed those in Notepad++, leaving just under 40,000 lines of text. This is provided for you in a zip file you can download here.

Setting up a RAG (Retrieval-Augmented Generation) system

A RAG system has three parts. Document storage and chunking break the Dickens text into pieces, embeddings and a vector database make those pieces searchable, and an LLM answers each question using the most relevant chunks.
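To make those moving parts concrete, here is a minimal sketch of the same flow in plain Python. It is not the rag.py used in this guide: the word-count "embedding" is a toy stand-in for a real embedding model, a plain list stands in for the vector database, and the function names (chunk_text, embed, retrieve, build_prompt) are made up for illustration.

```python
import math
import re
from collections import Counter


def chunk_text(text, words_per_chunk=50):
    """Split text into consecutive chunks of up to words_per_chunk words."""
    words = text.split()
    return [
        " ".join(words[i:i + words_per_chunk])
        for i in range(0, len(words), words_per_chunk)
    ]


def embed(text):
    """Toy embedding: a bag-of-words count vector (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, index, k=2):
    """Return the k chunks whose vectors are most similar to the question."""
    q = embed(question)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


def build_prompt(question, chunks):
    """Assemble the text that would be sent to the LLM."""
    context = "\n---\n".join(chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Tiny stand-in corpus; the real input would be the Dickens text file.
corpus = (
    "Pip lives on the marshes with his sister and Joe the blacksmith. "
    "Oliver asks the master of the workhouse for more gruel. "
    "Scrooge is visited by three spirits on Christmas Eve."
)
# The "vector database": each chunk stored next to its vector.
index = [(c, embed(c)) for c in chunk_text(corpus, words_per_chunk=12)]
top = retrieve("Who does Pip live with?", index, k=1)
prompt = build_prompt("Who does Pip live with?", top)
```

In a real setup the last step would hand prompt to the LLM. Here the interesting part is that only the chunk about Pip is selected, so the model never has to read the whole corpus.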

We’re going to be using Python; it’s the most popular programming language for RAG although others like Rust are starting to appear.

Initially I was going to run this on Ubuntu in a Hyper-V VM, but browsing the canirun.ai website showed that the VM didn’t have access to the GPU, so I switched to Windows. The site also identifies your GPU and lists the available LLMs, each with a rating: Runs Great, Decent, Barely Runs, or Too Heavy.

Start by creating a folder. This is where the virtual environment will be created.

Setting up Python

The Python version I’d previously installed on Windows 11 was 3.9, but I read that 3.10 was the minimum I should use, so I went for 3.14, which was the current version at the time. This was a mistake: some of the Python modules hadn’t been updated for 3.14, so I ultimately ended up using 3.11.
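If you want to catch the same problem before installing anything, a small guard like the one below can help. The bounds simply encode my experience described above (3.10 as the floor, 3.14 as too new at the time); they are an assumption, not an official compatibility statement for any of the modules, and the function name version_ok is just illustrative.

```python
import sys

MIN_VERSION = (3, 10)  # lowest version I read was advisable
TOO_NEW = (3, 14)      # version that some modules had not caught up with


def version_ok(version=None):
    """Return True if the (major, minor) version is in the workable range."""
    major, minor = (version or sys.version_info)[:2]
    return MIN_VERSION <= (major, minor) < TOO_NEW


current = "%d.%d" % sys.version_info[:2]
if version_ok():
    print(f"Python {current} should be fine for this guide.")
else:
    print(f"Python {current} is outside 3.10 to 3.13; consider 3.11.")
```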

First, I created a virtual environment and activated it, then upgraded pip to the latest version and installed the modules:

py -3.11 -m venv venv311
venv311\Scripts\activate
venv311\Scripts\python.exe -m pip install --upgrade pip
pip install langchain==0.1.20 langchain-community chromadb sentence-transformers ollama

This took long enough for a coffee break.
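Once the install finishes, it can be worth confirming that everything landed in the virtual environment. The sketch below uses only the standard library; the list holds the import names that I believe correspond to the packages in the pip command (hyphens in package names become underscores on import), and missing_modules is a hypothetical helper name.

```python
import importlib.util

# Import names for the packages installed above.
REQUIRED = [
    "langchain",
    "langchain_community",
    "chromadb",
    "sentence_transformers",
    "ollama",
]


def missing_modules(names=REQUIRED):
    """Return the names that cannot be found in the current environment."""
    return [n for n in names if importlib.util.find_spec(n) is None]


missing = missing_modules()
print("All modules found." if not missing else f"Missing: {missing}")
```

Run it with the environment activated; an empty result means the imports in rag.py should at least resolve.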

Pulling a model

This is the heart of the system, and there are a number of models to choose from. The right model to use depends on your GPU.

After picking your LLM on the canirun.ai website, install Ollama and use it to fetch the model. Models range in size from 1 GB to over 200 GB, with the larger commercial ones at the higher end. I went with Mistral 7B, one of the lighter models with a 4.4 GB download, so time for another coffee break:

ollama pull mistral
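To confirm the download worked you can run ollama list, or ask the local Ollama server directly. The sketch below assumes Ollama is running on its default port (11434) and reads its /api/tags endpoint, which returns the locally available models as JSON. The helper names are hypothetical, and the sample payload is a hand-written, trimmed-down illustration of the response shape rather than real output.

```python
import json
import urllib.request

OLLAMA_TAGS_URL = "http://localhost:11434/api/tags"  # default local address


def model_names(payload):
    """Extract model names (e.g. 'mistral:latest') from an /api/tags response."""
    return [m["name"] for m in payload.get("models", [])]


def has_model(payload, wanted):
    """True if any local model matches `wanted`, ignoring the ':tag' suffix."""
    return any(name.split(":")[0] == wanted for name in model_names(payload))


def fetch_tags(url=OLLAMA_TAGS_URL, timeout=5):
    """Ask the local Ollama server which models it has downloaded."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


# Hand-written example of the response shape, for illustration only.
sample = {"models": [{"name": "mistral:latest"}, {"name": "llama3:latest"}]}
print(has_model(sample, "mistral"))
```

With Ollama running, has_model(fetch_tags(), "mistral") performs the same check against your machine.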

Where do I put the Dickens text?

Get the dickens.zip file from GitHub. Extract rag.py into the main folder and dickens.txt into a data folder beneath the main folder.

All of the heavy lifting is done by rag.py, which is about 100 lines long. On the first run it loads the text and builds the embeddings; once everything is loaded, it runs in a loop where you enter questions.
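The question loop itself is the simplest part of a script like this. The sketch below is not taken from rag.py; it only illustrates the pattern. How the loop ends (a blank line, "quit" or "exit") is my assumption, and placeholder_answer stands in for the retrieval and LLM call.

```python
def question_loop(answer, read=input, write=print):
    """Read questions until the user enters a blank line, 'quit' or 'exit'."""
    asked = 0
    while True:
        try:
            question = read("Question: ").strip()
        except EOFError:
            break  # input stream closed, e.g. Ctrl+Z then Enter on Windows
        if not question or question.lower() in {"quit", "exit"}:
            break
        write(answer(question))
        asked += 1
    return asked


def placeholder_answer(question):
    """Stand-in for the retrieval + LLM step."""
    return f"(would retrieve chunks and ask the model about: {question})"
```

Calling question_loop(placeholder_answer) starts an interactive session. Passing read and write as parameters keeps the loop easy to test without a keyboard.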

Run rag.py with:

python rag.py

I tried it with this question: Compare Pip and David Copperfield

It answered:

Both Charles Dickens' characters, Pip and David Copperfield, undergo significant personal growth throughout their respective novels, "Great Expectations" and "David Copperfield." Although they share a common theme of transformation, their backgrounds, aspirations, and relationships differ significantly.

Pip, initially an orphan living with his sister and her husband, is often influenced by the people he encounters in his rural Kent surroundings. He aspires to rise above his humble origins and become a gentleman, influenced by the mysterious Estella and the convict Magwitch. Pip's journey involves learning about self-worth, integrity, and the consequences of one's actions.

David Copperfield, on the other hand, comes from a more middle-class background with both parents present. His story takes him through various professional pursuits before settling into a successful career as a writer. David experiences challenges in his relationships with family members, such as Mr. and Mrs. Murdstone, and friends, like Steerforth. However, he ultimately learns the importance of empathy, resilience, and the value of true friendships, represented by his relationships with Agnes Wickfield, Peggotty, and his own daughter, Emma.

In conclusion, while both characters experience personal growth through their trials and tribulations, Pip and David Copperfield have distinct backgrounds, aspirations, and relationships that shape their unique stories. They provide insight into different aspects of human development, showing readers the complexities and challenges of growing up in Victorian England.

Conclusion

Given this was running on a four-year-old PC, I was impressed with the response time, which was just a few seconds. The Mistral LLM has about 7.3 billion parameters, making it a small-to-medium-sized LLM that’s good at summarizing text and answering questions.
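The parameter count and the download size are linked by simple arithmetic, which helps explain why a model this size fits on a modest GPU. The sketch below takes the 7.3 billion figure and the 4.4 GB download quoted in this guide, treats 1 GB as 10^9 bytes and ignores any file overhead. Reading the result as a sign of roughly 4 to 5 bit compressed weights is my interpretation, not something stated by Ollama or Mistral.

```python
def implied_bits_per_weight(download_gb, params_billion):
    """Average bits stored per parameter, treating 1 GB as 10**9 bytes."""
    return download_gb * 8 / params_billion


def estimated_size_gb(params_billion, bits_per_weight):
    """Approximate model file size in GB at a given precision."""
    return params_billion * bits_per_weight / 8


mistral_bits = implied_bits_per_weight(4.4, 7.3)  # figures quoted in this guide
full_precision = estimated_size_gb(7.3, 16)       # same model at 16 bits per weight
print(f"{mistral_bits:.1f} bits per weight, {full_precision:.1f} GB at 16-bit")
```

The download works out to just under 5 bits per parameter, against about 14.6 GB if every weight were stored at 16 bits.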

If you want to try another model, say llama3:

  1. ollama pull llama3
  2. Edit rag.py and change line 78 to
    llm = Ollama(model="llama3")
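If you expect to switch models often, an alternative to editing the file each time is to read the model name from the command line. This is a sketch of one way to do that, not how rag.py currently works; parse_args and the --model option are names I have chosen for illustration.

```python
import argparse


def parse_args(argv=None):
    """Parse command-line options; --model defaults to mistral."""
    parser = argparse.ArgumentParser(description="Ask questions about a text.")
    parser.add_argument(
        "--model",
        default="mistral",
        help="name of a model already fetched with 'ollama pull'",
    )
    return parser.parse_args(argv)


# In a script like rag.py, the parsed value would replace the hard-coded
# name, for example: llm = Ollama(model=parse_args().model)
```

With that change in place, something like python rag.py --model llama3 would select the model without touching the source.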

Compared to Mistral, Llama 3 has 8 billion parameters. It is slightly slower but offers better quality, with stronger reasoning and better instruction following, which makes it the better choice for literary analysis and comparing characters. Try the “Compare Pip and David Copperfield” question with Llama 3 to see the difference. Here's what it returned for me:

Pip is the protagonist of Great Expectations (1861), while David Copperfield is the narrator and main character of The Personal History of David Copperfield (1849-1850). Both characters share some similarities:

* Both are orphaned or have lost their original families: Pip's parents died, and he was raised by his abusive sister and her husband; David Copperfield's mother dies when he is young, and he is sent away to a boarding school.

* Both experience significant life changes: Pip inherits a fortune and becomes a gentleman, while David Copperfield faces various challenges, including the cruelty of his stepfather and the loss of his beloved friend, Steerforth.

* Both have complex relationships with adults who shape their lives: Pip's interactions with Miss Havisham, Magwitch, and Estella significantly influence his development; David Copperfield's relationships with Mr. Micawber, Uriah Heep, and Agnes Wickfield have a profound impact on him.

However, there are also differences between the two characters:

* Pip is more introverted and passive, while David Copperfield is more outgoing and assertive.

* Pip's experiences are more focused on class and social status, whereas David Copperfield's story explores themes of childhood, friendship, and the consequences of one's actions.

Overall, both Pip and David Copperfield are well-developed characters who undergo significant growth and development throughout their respective novels.

By now you've pulled a model, built a RAG system, and asked an AI to compare Victorian literary protagonists — all from your own PC. Not bad for an afternoon's work and a couple of coffee breaks.

The rabbit hole goes deeper from here. Swap in a different model, point the pipeline at your own documents, or experiment with chunk sizes to improve retrieval accuracy. The setup is the same; only the data changes.
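To get a feel for the chunk-size experiment before touching the real pipeline, you can see how size and overlap change the number of chunks. The sketch below splits by characters with a sliding window. It makes no claim about the splitter or settings rag.py actually uses; chunk_with_overlap is a hypothetical helper, and the 1,000-character sample is a placeholder.

```python
def chunk_with_overlap(text, size, overlap):
    """Split text into windows of `size` characters, each sharing `overlap`
    characters with the previous window."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be >= 0 and smaller than size")
    step = size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # this window already reaches the end of the text
        start += step
    return chunks


sample = "a" * 1000  # placeholder; use your own document text here
counts = {
    size: len(chunk_with_overlap(sample, size, overlap=50))
    for size in (200, 500)
}
print(counts)
```

Smaller chunks give the retriever more, narrower pieces to choose from, while larger ones keep more context together. Which works better depends on your documents and questions, so it is worth trying a few values.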

And if it all goes sideways? At least the coffee was good.