nostr:npub1t5gkq629zym0lec7mwx923wvg2eu6jhvpjrndfy4vq5freqqezqq48n2s9 This is fascinating, thank you for sharing. Even if it's 10 seconds today, that clearly will fall over time.
If we wanted an Alexa-like product, I like the idea of a local base model handling all speech-to-text requests. Out of the box it would likely be quite bad, but if we could train it on a long series of home-focused queries (in the form of docs?), that would be interesting. I wonder how hard it would be to get it (or, more likely, improve it) to handle a new topic?