been saying this for about 3 months: there needs to be a field in the filter to exclude event IDs from being sent back, or a bloom filter for excludes, whichever you think is less expensive, a bloom filter match or the space for the exclude event ID array. anyhow, y'all gonna figure it out one way or another. i've worked out that none of you pay that much attention to what i say, and i get to play Cassandra in this show
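To make the proposal concrete, here is a minimal sketch of how an exclude field could work in filter matching. The `"!ids"` key name is an invention for illustration; no NIP defines it, and a real proposal would need to pick a name and wire format:

```python
# Hypothetical exclude field in a Nostr-style filter.
# "!ids" is an assumed key name, not part of any existing NIP.
def matches(event: dict, flt: dict) -> bool:
    """Return True if the event passes the filter, including the exclude list."""
    # standard positive match on IDs (simplified: exact match only)
    if "ids" in flt and event["id"] not in flt["ids"]:
        return False
    # the proposed negative match: drop events the client already has
    if "!ids" in flt and event["id"] in flt["!ids"]:
        return False
    return True
```

The relay-side cost is one set-membership check per candidate event; the wire cost is the serialized exclude array, which is what the bloom-filter variant would trade against.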
In principle I can see the merit. But in practice I see problems. Sending huge numbers of IDs to exclude is itself expensive. Bloom filters aren't precise. Negentropy (the other solution often suggested) requires potentially a lot of round trips. There are possibly other solutions, like fetching all the IDs first and then asking for only some of those events. Also, 256-bit IDs are cryptographically unique, but we don't need cryptographic uniqueness for this avoid-double-event problem, just practical uniqueness, so using the first 64 bits of an ID to make it shorter is a reasonable optimization for whichever solution turns out to be best. I haven't heard anybody make an argument that compares the options and concludes which one is superior, so we are still in an exploratory stage on this issue. But I don't deny the problem.
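The 64-bit truncation point can be sketched directly: a 256-bit hex event ID is 64 hex characters, so the first 64 bits are the first 16 characters. The function and set below are illustrative, not from any existing implementation:

```python
# Sketch: deduplicate events on the first 64 bits of the 256-bit ID.
# 64 bits = 16 hex chars. With n events seen, the expected number of
# accidental collisions is roughly n*(n-1)/2^65 (birthday bound), which
# at a million events is on the order of 10^-8 -- practically unique.
def short_id(event_id: str) -> str:
    """First 64 bits of a 64-hex-char event ID."""
    return event_id[:16]

seen: set[str] = set()

def is_new(event_id: str) -> bool:
    """True the first time an ID (by 64-bit prefix) is observed."""
    s = short_id(event_id)
    if s in seen:
        return False
    seen.add(s)
    return True
```

The same truncation applies to whichever transport wins: an exclude array shrinks 4x, and a bloom filter can hash the short prefix instead of the full ID.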
i think we are also in a phase where we are growing out of the "dumb relay" idea and into the "modular relay that has multiple services" phase. there really is nothing stopping someone from making a neat GraphQL query API interface to a relay database; it's really just a question of writing queries and creating indexes
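The "just queries and indexes" point can be illustrated with a toy relay table. This uses SQLite for brevity and invents a minimal schema; real relay databases vary, and the GraphQL layer on top would only be translating queries like this:

```python
import sqlite3

# Toy relay event store: an assumed minimal schema, not any real relay's.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE events ("
    "  id TEXT PRIMARY KEY,"
    "  pubkey TEXT,"
    "  kind INTEGER,"
    "  created_at INTEGER,"
    "  content TEXT)"
)
# The index is what makes author+kind queries cheap -- this is the
# "creating indexes" part of exposing a richer query API.
db.execute("CREATE INDEX idx_pubkey_kind_time ON events (pubkey, kind, created_at)")

db.execute(
    "INSERT INTO events VALUES (?, ?, ?, ?, ?)",
    ("eid1", "pk1", 1, 1700000000, "hello"),
)

# A query a GraphQL resolver might run under the hood.
rows = db.execute(
    "SELECT id FROM events WHERE pubkey = ? AND kind = ? ORDER BY created_at DESC",
    ("pk1", 1),
).fetchall()
```

Any query language (GraphQL, SQL over HTTP, whatever) bottoms out in statements like this; the service layer is mostly plumbing.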