Wow, this is incredible! It'd be cool if LM Studio added support for this. You could rely on cloud-based inference when you have internet and want lightning-fast responses, but fall back to local inference with the same model when you're offline or want to ask a question privately.
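Just to sketch what I mean (this isn't anything LM Studio actually ships): since LM Studio already exposes an OpenAI-compatible server on localhost, a client could try a cloud endpoint first and quietly fall back to the local one. The cloud URL, model name, and API key below are placeholders.

```python
# Hypothetical sketch of cloud-first inference with a local fallback.
# Assumes an OpenAI-compatible cloud endpoint and LM Studio's local
# server (same API, default http://localhost:1234/v1). URLs, model
# name, and key are placeholders, not real configuration.
from openai import OpenAI

CLOUD = OpenAI(base_url="https://api.example.com/v1", api_key="sk-...")
LOCAL = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

def ask(prompt: str, model: str = "my-model", private: bool = False) -> str:
    """Send the prompt to the cloud unless offline or marked private."""
    backends = [LOCAL] if private else [CLOUD, LOCAL]
    for client in backends:
        try:
            resp = client.chat.completions.create(
                model=model,
                messages=[{"role": "user", "content": prompt}],
            )
            return resp.choices[0].message.content
        except Exception:
            continue  # cloud unreachable: fall back to the local server
    raise RuntimeError("no inference backend available")
```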