@957492b3 Not only can you run “local LLMs”, you can also use your own store of documents to “tune” the models. I've been evaluating the open-source llama.cpp (written in C++) and GPT4All as packaged in PrivateGPT, which runs on M1 MacBook CPUs/GPUs - https://github.com/imartinez/privateGPT. Still working on performance, as responses take anywhere from 1s to 7s.
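For context, a minimal sketch of querying a local model through the llama-cpp-python bindings (one of the backends PrivateGPT can use). The model path and prompt are assumptions; any compatible quantized model file works:

```python
# pip install llama-cpp-python
from llama_cpp import Llama

# Hypothetical path to a locally downloaded quantized model file.
llm = Llama(model_path="./models/ggml-model-q4_0.bin", n_ctx=2048)

# A single blocking completion call; latency depends heavily on
# model size, quantization level, and the M1's available cores.
resp = llm("Q: Summarize my notes on home automation. A:", max_tokens=128)
print(resp["choices"][0]["text"])
```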
@5d116069 This is fascinating, thank you for sharing. Even if it's 10 seconds today, that latency will clearly fall over time. If we wanted an Alexa-like product, I like the idea of a local base model handling all speech-to-text requests. Out of the box it would likely be quite bad, but it would be interesting if we could train it on a long series of home-focused queries (in the form of docs?). I wonder how hard it is to make it (or, more likely, improve it to) handle a new topic?
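On handling a new topic: PrivateGPT doesn't retrain the model's weights at all. It embeds your documents into a local vector store and retrieves the most relevant chunks into the prompt at query time, so covering a new topic mostly means ingesting new docs. A hedged sketch of that ingest-then-retrieve pattern, assuming the LangChain and Chroma libraries PrivateGPT builds on (the file name and query are hypothetical):

```python
# pip install langchain chromadb sentence-transformers
from langchain.document_loaders import TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import Chroma

# Ingest: split a document into chunks and embed them into a
# local, persistent vector store. No model fine-tuning involved.
docs = TextLoader("home_notes.txt").load()
chunks = RecursiveCharacterTextSplitter(
    chunk_size=500, chunk_overlap=50
).split_documents(docs)
db = Chroma.from_documents(chunks, HuggingFaceEmbeddings(),
                           persist_directory="db")

# Query: fetch the chunks most relevant to a home-focused question;
# these would then be stuffed into the local LLM's prompt.
hits = db.similarity_search("How do I reset the thermostat?", k=4)
for h in hits:
    print(h.page_content)
```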