Meta released MobileLLM models in 125M, 350M, 600M, and 1B parameter sizes. Their 1B model performs well on phones, even mid-range devices. These models are specifically designed and optimized for mobile hardware. They work with llama.cpp. https://huggingface.co/collections/facebook/mobilellm-6722be18cb86c20ebe113e95
How do you run them?
At the moment, it works with llama.cpp from any terminal. Once quantized, you should be able to use it with MLC as well. I will also integrate it into Aithena, which will let you run it locally in your browser on any device without sacrificing performance.
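For the browser path, here is a minimal sketch using MLC's web-llm runtime. The model id `MobileLLM-1B-q4f16_1-MLC` is a placeholder: MobileLLM is not in web-llm's prebuilt model list as far as I know, so this assumes a compiled MLC build of the 1B model exists.

```ts
import { CreateMLCEngine } from "@mlc-ai/web-llm";

async function main() {
  // Placeholder model id — assumes an MLC-compiled, quantized build of MobileLLM 1B.
  const engine = await CreateMLCEngine("MobileLLM-1B-q4f16_1-MLC", {
    initProgressCallback: (p) => console.log(p.text), // weight download progress
  });

  // OpenAI-style chat API, running fully in the browser via WebGPU.
  const reply = await engine.chat.completions.create({
    messages: [{ role: "user", content: "Hello from my phone!" }],
  });
  console.log(reply.choices[0]?.message.content);
}

main();
```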
Right now, you can try other models and run them locally on any device. When I say 'other,' I mean pretty much any open-source model. In the settings, you can choose a model; once selected, it is downloaded and cached for future use. https://aitheena.vercel.app/
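A minimal sketch of the download-then-cache flow, assuming the app persists weight shards with the browser Cache API. The cache name and URL handling here are made up for illustration, not Aithena's actual implementation.

```ts
const CACHE_NAME = "model-weights-v1"; // hypothetical cache bucket name

async function fetchWithCache(url: string): Promise<ArrayBuffer> {
  const cache = await caches.open(CACHE_NAME);
  const hit = await cache.match(url);
  if (hit) return hit.arrayBuffer(); // later visits: serve shard from cache

  const res = await fetch(url); // first visit: download the shard
  if (!res.ok) throw new Error(`Failed to fetch ${url}: ${res.status}`);
  await cache.put(url, res.clone()); // persist for future sessions
  return res.arrayBuffer();
}
```

With something like this, the model is only downloaded once per browser; subsequent loads read the weights straight from local storage.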