Tagged Articles
8 posts
A granular comparison of the software tools used to actually load and "chat" with your quantized model files.
An overview of how to transition from simply "talking" to your local model to connecting it to your personal data and local apps.
A guide to Retrieval-Augmented Generation (RAG), which allows your local model to search through your private PDFs, notes, and emails for instant answers.
A deeper dive into "Vision-Language Models" (VLMs) that allow you to ask your local AI questions about your personal photo library or screenshots.
The final "how-to" step: finding a model on Hugging Face, loading it into your software, and sending your first offline prompt.
A look at the "hidden" part of RAG: how tools like ChromaDB or LanceDB store your files as embedding vectors (points in a mathematical space) so the AI can find the most relevant passages.
A technical look at quantization, the "shrinking" process (converting 16-bit weights to 8-bit or 4-bit) that allows massive models to run on consumer-grade hardware.
An introduction to the benefits of running models on your own machine, from total data privacy to avoiding monthly subscription fees.