nostr:npub1j46f9va5qdrmamcwmncew7g8mwekgma386dhws6k04fsnlkqpcpsj23gm7 Not only can you run "local LLMs", you can point them at your own store of documents to "tune" their responses. Been evaluating the open-source llama.cpp (written in C++) and GPT4All, packaged as part of PrivateGPT, which runs on M1 MacBook CPUs/GPUs: https://github.com/imartinez/privateGPT. Still working on performance, as responses can take anywhere from 1s to 7s.
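If you want to try the "local LLM" half without standing up the full PrivateGPT stack, here's a minimal sketch using the gpt4all Python bindings. The model filename is an assumption; any GGUF model from the GPT4All catalog should work:

```python
# pip install gpt4all
from gpt4all import GPT4All

# Downloads the model on first run if it isn't already cached locally.
# Filename is an assumption; swap in any GPT4All-supported GGUF model.
model = GPT4All("orca-mini-3b-gguf2-q4_0.gguf")

# chat_session() keeps conversational context across generate() calls.
with model.chat_session():
    reply = model.generate(
        "Explain retrieval-augmented generation in two sentences.",
        max_tokens=256,
    )
    print(reply)
```

Inference runs entirely on-device, so the 1s-7s spread largely tracks model size and quantization level.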