I don’t think it requires the cache being global, although that’s the approach primal went with. I’m imaging a second process running near your relay that’s responsible for projecting the current state eg revision of a note. this second layer could expose the same api as a standard nostr relay to make clients just work. you could also suck this second layer logic into your relay implementation depending on your scale/reliability/ops requirements.