Tutorial - Tom Compagno

Choosing Your Runner – LM Studio vs. Ollama vs. Kobold

A granular comparison of the software tools used to actually load and "chat" with your quantized model files.

From Chatbot to Assistant – The Power of Integration

An overview of how to transition from simply "talking" to your local model to connecting it to your personal data and local apps.

Local RAG – Teaching Your AI About Your Life

A guide to Retrieval-Augmented Generation (RAG), which allows your local model to search through your private PDFs, notes, and emails for instant answers.

Multi-Modal RAG – Talking to Your Images and Videos

A deeper dive into "Vision-Language Models" (VLMs) that allow you to ask your local AI questions about your personal photo library or screenshots.

The First Boot – Downloading and Running Your First GGUF

The final "how-to" step: finding a model on Hugging Face, loading it into your software, and sending your first offline prompt.

The Vector Vault – Understanding Local Databases

A look at the "hidden" part of RAG: how tools like ChromaDB or LanceDB store your files as vectors (points in a high-dimensional space) so the AI can find the passages closest in meaning to your question.
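
To make the "mathematical points" idea concrete, here is a minimal sketch of nearest-neighbor lookup by cosine similarity in plain Python. It does not use the ChromaDB or LanceDB APIs. The file names and the 3-dimensional vectors are made up for illustration; real embeddings come from an embedding model and typically have hundreds of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical documents with made-up 3-dimensional "embeddings".
documents = {
    "tax_return_2023.pdf": [0.9, 0.1, 0.0],
    "pasta_recipe.txt": [0.0, 0.2, 0.9],
    "meeting_notes.md": [0.4, 0.8, 0.1],
}

def nearest(query_vector, docs):
    """Return the name of the document whose vector is most similar to the query."""
    return max(docs, key=lambda name: cosine_similarity(query_vector, docs[name]))

# Hypothetical embedding of a question about taxes.
query = [0.8, 0.2, 0.1]
best = nearest(query, documents)
print(best)
```

A vector database performs the same kind of comparison, but over many more vectors and usually with an index that avoids checking every stored point.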

Quantization – Fitting a Giant in a Small Box

A technical look at the "shrinking" process (converting 16-bit model weights to 8-bit or 4-bit) that allows massive models to run on consumer-grade hardware.
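
As a rough illustration of why the bit width matters, the sketch below estimates the size of the weights alone at each precision. The 7-billion-parameter count is a hypothetical example, and the estimate ignores file metadata, per-block scaling factors, and runtime memory such as the context cache, so real quantized files will be somewhat larger.

```python
def weight_size_gb(n_params: int, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in gigabytes (10^9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9

# Hypothetical 7-billion-parameter model.
N_PARAMS = 7_000_000_000

sizes = {bits: weight_size_gb(N_PARAMS, bits) for bits in (16, 8, 4)}
for bits, gb in sizes.items():
    print(f"{bits:>2}-bit: {gb:.1f} GB")
```

Under these assumptions, the weights shrink from about 14 GB at 16-bit to about 7 GB at 8-bit and about 3.5 GB at 4-bit.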

Why Go Local? The Case for Private AI

An introduction to the benefits of running models on your own machine, from total data privacy to avoiding monthly subscription fees.
