I've been attempting that, and it's quite challenging, so I'll be working on it with Gemini. It will focus on private tasks such as TTS, PDF summarization, and basic Q&A, using small, task-specific language models running on WebGPU.
It would be nice if we could set the address of an OpenAI API-compatible LLM endpoint, to give users a choice.
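A rough sketch of what a configurable endpoint could look like, assuming the server exposes the usual OpenAI-style `/chat/completions` route under a user-supplied base URL. The names `LlmEndpointConfig`, `buildChatRequest`, and `chat` are made up for illustration and do not come from any existing project; the URL and model name in the usage line are placeholders.

```typescript
// Hypothetical settings a user could edit in the app.
interface LlmEndpointConfig {
  baseUrl: string; // user-supplied base URL, e.g. ending in "/v1"
  apiKey?: string; // optional; some local servers do not need one
  model: string;   // model name understood by that server
}

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface BuiltRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Build the HTTP request without sending it, so the URL handling is easy to check.
function buildChatRequest(config: LlmEndpointConfig, messages: ChatMessage[]): BuiltRequest {
  // Strip trailing slashes so "…/v1" and "…/v1/" produce the same URL.
  const url = config.baseUrl.replace(/\/+$/, "") + "/chat/completions";
  const headers: Record<string, string> = { "Content-Type": "application/json" };
  if (config.apiKey) {
    headers["Authorization"] = `Bearer ${config.apiKey}`;
  }
  const body = JSON.stringify({ model: config.model, messages });
  return { url, init: { method: "POST", headers, body } };
}

// Send the request and return the first reply, assuming an OpenAI-style response shape.
async function chat(config: LlmEndpointConfig, messages: ChatMessage[]): Promise<string> {
  const { url, init } = buildChatRequest(config, messages);
  const res = await fetch(url, init);
  if (!res.ok) {
    throw new Error(`LLM request failed with status ${res.status}`);
  }
  const data = await res.json();
  return data.choices[0].message.content;
}
```

Usage would be something like `chat({ baseUrl: "http://localhost:8080/v1", model: "local-model" }, [{ role: "user", content: "Summarize this PDF text" }])`, with the WebGPU models staying the default and the remote endpoint as an opt-in setting.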