Download GGUF quantizations of the models (Q4 at minimum, meaning 4-bit or higher precision) and run them in an app like MLC Chat or Layla.ai. Personally, I'm not sure what software is available on iPhone.
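To check whether a given quantization will fit on your phone, here's a rough back-of-the-envelope sketch. It only estimates the weight size as parameters times bits per weight; the function names, the 2 GB headroom figure, and the example bit widths are my own assumptions for illustration, not numbers from any particular app. Real GGUF files run a bit larger because of metadata and mixed-precision layers, and you also need memory for the context.

```python
def estimate_weights_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight size in GB (10^9 bytes).

    params_billion * 1e9 weights * bits_per_weight / 8 bytes, divided by 1e9.
    Ignores file metadata and the runtime context (KV cache).
    """
    return params_billion * bits_per_weight / 8


def fits_in_ram(weights_gb: float, ram_gb: float, headroom_gb: float = 2.0) -> bool:
    """Hypothetical rule of thumb: leave some RAM for the OS and the context.

    The 2 GB default headroom is an assumption, not a measured value.
    """
    return weights_gb + headroom_gb <= ram_gb


if __name__ == "__main__":
    # Example: a 7B-parameter model at a few nominal bit widths on an 8 GB phone.
    for bits in (4, 5, 8):
        size = estimate_weights_gb(7, bits)
        verdict = "fits" if fits_in_ram(size, ram_gb=8) else "too big"
        print(f"7B at {bits}-bit: ~{size:.1f} GB ({verdict})")
```

Treat the result as a lower bound: if the estimate is already close to your phone's RAM, go for a smaller model rather than dropping below Q4.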