It seems obvious to me that silicon will continue to get faster, with much more storage. But what does this mean?

Well, for starters: whatever #ChatGPT evolves into will likely be able to run LOCALLY on the equivalent of my 'phone' (or my "home server") in <10 years.

In fact, many things we assume to be 'cloud only' could easily become "local only" with this shift. This doesn't mean "cloud" is dead, but it will be a huge shift. #UX #LLM #TECH
@957492b3 I agree with you that it will be able to run locally, but I don't think it will. Not because it can't, but because large tech companies prefer to own, and have access to, our data. If the data is stored locally, it's harder to monetize.
@957492b3 Not sure about this; while the model may be able to run locally, I suspect it will be live-querying a range of databases as needed to augment its output and achieve its tasks, and these will be proprietary. Local is only really local if you could take it back in time and run it…
As a few replies have asked: I don't mean "never cloud", only that it's possible to run "mostly local". This seems transformative (and is already the direction Siri and Google are headed).

Even just as a simple fallback, this would be very helpful. I'm just saying that as privacy concerns grow (and alternatives appear), the ability to be *mostly* local strikes me as a rather disruptive position.

Products like @cbff34d4 are well positioned to take advantage of this. 
@957492b3 I'm especially interested in the 'personal assistant' concept (if that's even possible with this generation of models), because the depth of personal information users would necessarily need to share for the 'assistant' to be really useful (as useful as a trusted human, for example) will inevitably raise privacy issues.
 @957492b3 @9589bde7 

People are running LLMs locally. You might be interested to look at 

https://python.langchain.com/docs/guides/local_llms
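
For a sense of scale, a quantised 7B model already runs on an ordinary laptop CPU. Here's a minimal sketch using LangChain's LlamaCpp wrapper (needs llama-cpp-python installed); the model path and settings are assumptions, so point it at whichever GGUF file you've actually downloaded:

    # Minimal sketch: query a quantised model entirely on-device via llama.cpp.
    # The model path below is an assumption; substitute any local GGUF model file.
    from langchain.llms import LlamaCpp

    llm = LlamaCpp(
        model_path="./models/llama-2-7b-chat.Q4_K_M.gguf",  # hypothetical local file
        n_ctx=2048,       # context window
        temperature=0.7,
        max_tokens=256,
    )

    # No network calls: the prompt and the completion never leave the machine.
    print(llm("Why might running an LLM locally matter for privacy?"))

The same guide covers Ollama and GPT4All back-ends, so you can swap the wrapper without changing the calling code.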

Useful Sensors (founded by Pete Warden and other ex-Google folks) are working on specialised boxes - running OSS LLMs on low-cost, low-power hardware with strong data privacy.

https://usefulsensors.com/

Founder/CEO Pete Warden recently wrote about the economics of it, framing it as an inevitable shift from training to inference:

https://petewarden.com/2023/09/10/why-nvidias-ai-supremacy-is-only-temporary/

#localLLM #LLM #UX 

h/t @27e3ee9e 
 @957492b3 

Not only can you run "local LLMs", you can also use your own store of documents to "tune" the models.

Been evaluating the OSS llama.cpp (written in C++) and GPT4All, packaged as part of PrivateGPT, which runs on M1 MacBook CPUs/GPUs - https://github.com/imartinez/privateGPT.
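
If you want to try the underlying pieces without the full PrivateGPT stack, a bare-bones sketch with the gpt4all Python bindings looks roughly like this (the model filename is an assumption; use whichever quantised model you've downloaded):

    # Sketch only: load a GPT4All model and generate a completion fully offline.
    # The model filename is an assumption; gpt4all can download it on first use if missing.
    from gpt4all import GPT4All

    model = GPT4All("ggml-gpt4all-j-v1.3-groovy.bin")  # CPU by default, runs on an M1
    reply = model.generate("Summarise the arguments for running LLMs locally.", max_tokens=200)
    print(reply)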

Working on performance, as responses can take anywhere from 1s to 7s.
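
If you're chasing that 1s-7s spread, a crude way to see where it lands on your machine is to time a few generations (sketch, reusing the hypothetical gpt4all model object from above; numbers will vary with prompt length, model size/quantisation, and hardware):

    # Rough latency check for local generation.
    import time

    prompts = ["What is a local LLM?", "List three privacy benefits of on-device inference."]
    for prompt in prompts:
        start = time.perf_counter()
        model.generate(prompt, max_tokens=128)
        print(f"{prompt[:30]!r} took {time.perf_counter() - start:.1f}s")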