At the moment, it runs with llama.c from any terminal. Once it's quantized, it should work with mlc as well. I will also integrate it into Aithena, which will let you run it locally in your browser on any device without sacrificing performance.