Oddbean

You can use Ollama for inferring some playground models. It can run on both CPU and GPU. Python is required. Learn how to create the dataset (in ChatML format or completion format, for example). For fine tuning, if you have Nvidia GPU(s), try using Unsloth. As for me, I'm using my MacBook with the MLX framework (built for Apple Silicon) to fine-tune and infer my LLMs in one place. It's very easy to start with self-hosted solutions.

Training will be the tougher part of this. There’s lots of data preparation in that task, but there are ways to fine-tune and just use a file or some text as a reference that will help in giving a quick view of what it would look like. For running LLMs locally, I use LMStudio, Ollama, and GPT4All most. For hardware, basically get anything that lets you customize your GPU and get one with a good chunk of VRAM. The unified memory in the new Macs is actually pretty great too, but not all the tools make use of it yet (most do). As far as books, I would actually do two that I’ve read in Ai Unchained: • A Gentle Introduction Into Large Language Models • The Bitcoin and Ai Industry Report Both of these are great foundational pieces to get a big picture view of the task and various ways you can go about it. Also I would check out an embedding model like snowflake that you can run locally. This lets you take some text or data and vectorize it to be used easily by an LLM. Makes a great tool for “asking questions to your data and files” and also for fine tuning or context search on your own machine. It’s not something I’ve explored very deeply, but I’ve started tinkering. Lastly, I would check out LangChain, and also maybe our Devs Who Can’t Code series, just so you can get the mindset for how to use LLMs to build applications. This process and the tools are getting better by the day, so keep an eye out. I’ll try to keep up with it and let you know when something drops to materially change what I’m able to accomplish. 👍🏻🫡