It's one way to do it, but it's sort of centralized because it assumes you're running a cache for the whole network. I can't imagine it would be easy to compose multiple caches, each covering a different part of the network. What's needed is a better interface, one that lets you query particular relays ad hoc or compose the results from multiple caches.
I don’t think it requires the cache to be global, although that’s the approach Primal went with. I’m imagining a second process running near your relay that’s responsible for projecting the current state, e.g. the latest revision of a note. This second layer could expose the same API as a standard nostr relay so that clients just work. You could also pull this second-layer logic into your relay implementation, depending on your scale/reliability/ops requirements.
By analogy with a large-scale distributed system: new events would hit a queue. One consumer group would write the raw events to storage. Another consumer group would update the projected state in the second layer when applicable.
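A minimal sketch of that fan-out, to make the two consumer roles concrete. Everything here is hypothetical (`Event`, `RawStore`, `Projection`, `fan_out`), the queue is just an in-memory list, and the "newest `created_at` wins, lowest id breaks ties" rule keyed on (pubkey, kind, `d` tag) is an assumption modeled on how replaceable events are usually handled, not something settled in this thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    # Simplified stand-in for a nostr event; d_tag is the value of a "d" tag, if any.
    id: str
    pubkey: str
    created_at: int
    kind: int
    d_tag: str = ""
    content: str = ""


class RawStore:
    """First consumer: keeps every event exactly as received."""

    def __init__(self):
        self.events = {}

    def consume(self, ev):
        self.events.setdefault(ev.id, ev)


class Projection:
    """Second consumer: keeps only the current revision per key."""

    def __init__(self):
        self.current = {}

    def consume(self, ev):
        key = (ev.pubkey, ev.kind, ev.d_tag)  # assumed revision key
        prev = self.current.get(key)
        newer = prev is None or ev.created_at > prev.created_at
        tie = prev is not None and ev.created_at == prev.created_at and ev.id < prev.id
        if newer or tie:
            self.current[key] = ev


def fan_out(queue, consumers):
    # Each consumer sees every event, like independent consumer groups on one topic.
    for ev in queue:
        for consumer in consumers:
            consumer.consume(ev)
```

Because the projection only compares the incoming event with what it already holds, events can arrive out of order and the result is the same; the raw store stays the source of truth if the projection ever needs rebuilding.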
Yeah, I think I agree. The query interface just needs to be carefully designed so that multiple caches can be queried simultaneously and their results reconciled. See the discussion around COUNT.
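Roughly what "queried simultaneously and reconciled" could look like on the client side. This is only a sketch under assumptions: each cache is modeled as a plain callable that takes a filter dict and returns a list of event dicts with `id` and `created_at` fields, and `query_all` / `reconcile` are made-up names, not part of any existing library.

```python
from concurrent.futures import ThreadPoolExecutor


def query_all(caches, flt):
    """Send the same filter to every cache in parallel; returns one result list per cache."""
    with ThreadPoolExecutor(max_workers=max(1, len(caches))) as pool:
        return list(pool.map(lambda cache: cache(flt), caches))


def reconcile(result_sets, limit=None):
    """Merge per-cache results: dedupe by event id, newest first."""
    by_id = {}
    for events in result_sets:
        for ev in events:
            by_id.setdefault(ev["id"], ev)  # same id means same event, keep one copy
    merged = sorted(by_id.values(), key=lambda e: (-e["created_at"], e["id"]))
    return merged if limit is None else merged[:limit]
```

Lists of events reconcile cleanly because ids make overlap detectable. A bare number per cache does not: if two caches both hold the same event, adding their counts counts it twice, which is why counting is the awkward case for this kind of composition.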