If you’re thinking about running an LLM at home for the first time, here are my top 4 tips:
1. Try GPT4ALL and/or ollama. These are launchers that help you download and interact with models. GPT4ALL is a GUI, while ollama is a command line program. (There’s a minimal usage sketch after this list.)
2. Current models come in a handful of sizes; the most common are 7B and 70B parameters, which are roughly 4GB and 40GB to download, respectively, though some are even bigger. If your GPU is supported and has enough VRAM to hold the model, it can run on the GPU; otherwise it falls back to the CPU, which works but is much slower. Start with a ~4GB model. (See the back-of-the-envelope size estimate after this list.)
3. Although there are only a few popular base architectures (Llama, Mistral, etc.), there are thousands of fine-tuned variants to choose from. Hugging Face (terrible name) is the site to browse for models. (A download sketch follows this list.)
4. “Alignment” is the new word for bias (particularly of the philosophical/political kind). A model tweaked to be maximally compliant and even-handed is called “unaligned” (you’ll also see “uncensored”). The big mainstream models are “aligned” with the companies that produced them. Look for unaligned models if you want less biased results. (I’ve been happy with the Dolphin line of models from Cognitive Computations.)
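For tip 1, here’s a minimal Python sketch of what talking to ollama looks like once it’s installed. It assumes ollama is running locally on its default port (11434) and that you’ve already pulled a model; the model name "llama3" below is just a placeholder for whatever you actually downloaded.

```python
# Minimal sketch: ask a locally running ollama server one question.
# Assumes `ollama serve` is running and you've already pulled a model
# (substitute the model name you actually downloaded for "llama3").
import json
import urllib.request


def ask(prompt: str, model: str = "llama3") -> str:
    """Send a single prompt to the local ollama HTTP API and return the reply."""
    payload = json.dumps({
        "model": model,
        "prompt": prompt,
        "stream": False,  # one complete response instead of a token stream
    }).encode("utf-8")
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]


if __name__ == "__main__":
    print(ask("Explain what a 7B parameter model is in one sentence."))
```

If you don’t want to write any code at all, typing `ollama run <model>` in a terminal gives you the same thing interactively.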
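For tip 2, a quick back-of-the-envelope way to guess whether a model will fit in your VRAM: parameter count × bits per weight ÷ 8, plus some overhead. The 4-bit quantization level and the ~20% overhead figure below are my ballpark assumptions, not exact numbers.

```python
# Rough size estimate: parameters * bits-per-weight / 8, plus overhead
# for context and bookkeeping (the 1.2 factor is a ballpark assumption).
def approx_size_gb(params_billion: float, bits_per_weight: int = 4,
                   overhead: float = 1.2) -> float:
    bytes_total = params_billion * 1e9 * bits_per_weight / 8
    return bytes_total * overhead / 1e9


for size in (7, 13, 70):
    print(f"{size}B at 4-bit: ~{approx_size_gb(size):.1f} GB")

# 7B at 4-bit lands around 4 GB and 70B around 40 GB, which matches the
# file sizes you'll typically see for downloadable quantized models.
```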
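For tip 3, once you’ve picked a model on Hugging Face, the `huggingface_hub` Python package can fetch a single model file for you. The repo and file names below are placeholders, not recommendations; copy the real ones from the model page you choose.

```python
# Sketch: download one quantized model file from Hugging Face.
# Requires `pip install huggingface_hub`. The repo_id and filename are
# hypothetical placeholders -- copy the real ones from the model page.
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="SomeUser/some-model-GGUF",   # hypothetical repo, pick your own
    filename="some-model.Q4_K_M.gguf",    # hypothetical quantized file name
)
print("Downloaded to:", path)
```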
Good luck!