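On-device generative machine learning models are coming to Macs. Apple managed to squeeze a large language model into a MacBook Pro M2 by storing the model weights in flash memory and loading them into DRAM on demand, made possible on M-series chips. According to the paper, this lets it run models up to 2x larger than the available DRAM, with inference 4-5x faster on CPU and 20-25x faster on GPU than naive loading. https://arxiv.org/pdf/2312.11514.pdf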
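
The general idea of keeping weights in flash and reading only what is needed can be sketched roughly as below. This is a toy illustration, not the paper's method: the file layout, `ROW_FLOATS`, `write_weights`, `load_rows`, and `demo` are all hypothetical names made up here, and a plain file read through a memory map stands in for flash storage.

```python
import mmap
import os
import struct
import tempfile

ROW_FLOATS = 4               # hypothetical row width (float32 values per row)
ROW_BYTES = ROW_FLOATS * 4   # each float32 takes 4 bytes


def write_weights(path, rows):
    """Write rows of float32 values to a flat binary file (stand-in for flash)."""
    with open(path, "wb") as f:
        for row in rows:
            f.write(struct.pack(f"<{ROW_FLOATS}f", *row))


def load_rows(path, active_ids):
    """Read only the requested rows via a memory map; other rows are never unpacked."""
    with open(path, "rb") as f, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        return {
            i: struct.unpack_from(f"<{ROW_FLOATS}f", mm, i * ROW_BYTES)
            for i in active_ids
        }


def demo():
    """Store 8 rows on disk, then fetch just rows 1 and 6."""
    rows = [[float(r * 10 + c) for c in range(ROW_FLOATS)] for r in range(8)]
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "weights.bin")
        write_weights(path, rows)
        return load_rows(path, [1, 6])
```

Calling `demo()` returns only the two requested rows, keyed by row index. A real system would also need to decide which rows are worth loading and to batch reads into larger chunks, which this sketch does not attempt.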