Oddbean
 Sure, but that's only for new notes, because you cannot query the database for historic notes using a Bloom filter. 
 The relay would run through all pubkeys, check which ones are inside the Bloom filter to get the set of keys I am interested in, and then run the usual query. 
 Oh I see, it is used as a compression mechanism then. It can work. Thanks for clarifying. 
 You can use a Golomb-Rice encoded list then. It can encode all the pubkeys compactly, and the relay can expand it without needing any other data source. It has no false positives and is less CPU-intensive, especially for the relay. Take it into account.  
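A Golomb-Rice coded set works roughly like BIP-158 compact block filters: reduce each pubkey to a small integer, sort, delta-encode, and Rice-code each delta as a unary quotient plus a fixed-width remainder. A minimal sketch in Python; the function names and the Rice parameter are illustrative (assume k >= 1), not from any Nostr spec:

```python
def rice_encode(values, k):
    """Encode sorted non-negative integers as a bit string (Rice parameter k >= 1)."""
    bits = []
    prev = 0
    for v in sorted(values):
        delta = v - prev
        prev = v
        q, r = delta >> k, delta & ((1 << k) - 1)
        bits.append("1" * q + "0")          # quotient in unary, 0-terminated
        bits.append(format(r, f"0{k}b"))    # remainder in k fixed bits
    return "".join(bits)

def rice_decode(bitstring, k, count):
    """Recover the original sorted integers from the bit string."""
    out, pos, prev = [], 0, 0
    for _ in range(count):
        q = 0
        while bitstring[pos] == "1":        # read unary quotient
            q += 1
            pos += 1
        pos += 1                            # skip the terminating 0
        r = int(bitstring[pos:pos + k], 2)  # read k-bit remainder
        pos += k
        prev += (q << k) | r                # undo the delta encoding
        out.append(prev)
    return out

vals = [3, 17, 40, 41, 100]
encoded = rice_encode(vals, 3)
assert rice_decode(encoded, 3, len(vals)) == vals
```

Because deltas between sorted values are small, most of them fit in a few bits, which is where the compactness comes from.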
 Isn't that in the same ballpark as what zip/zlib compression already does? 

The WebSocket connection can be compressed using gzip, etc. So whatever we do needs to offer a significant gain compared to an already compressed representation of the keys as hex strings.  
 Not sure how well gzip will compress filters; cryptographic bytes usually don't compress well because they are high entropy. 
 Zip won't help much when compressing lists of pubkeys.
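The high-entropy point is easy to demonstrate: DEFLATE barely touches raw random 32-byte values, while hex-encoded ones compress to roughly half only because hex wastes four bits per character. A quick sketch, with random bytes standing in for pubkeys:

```python
import os
import zlib

# 1000 random 32-byte values stand in for pubkeys.
keys = [os.urandom(32) for _ in range(1000)]
raw = b"".join(keys)
hexed = raw.hex().encode()

# Raw random bytes are incompressible: output is about as big as the input.
print(len(zlib.compress(raw)), "vs raw", len(raw))

# Hex text roughly halves, but only back to the raw byte size.
print(len(zlib.compress(hexed)), "vs hex", len(hexed))
```

So zlib mostly just undoes the hex expansion; it cannot shrink the underlying 32 bytes per key.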

Even for a mobile client, uploading 67kB is not much, especially compared to what relays send in reply. If you want to make your client more responsive on startup, maybe you can query frequent posters first, to get something on the screen and send the full filter later.

If you consider involving the relay ... maybe we can reference collections of pubkeys via list events? The relay would get the relevant list event via its naddr and then expand that list into the original query. Instead of {"authors":["c88821bdafd6c608107c9c4969672b57c3a8c85f2a0f6bc7b749c3c5609b3424"]}, {"authorLists":["naddr1qvzqqqr4xypzq3huhccxt6h34eupz3jeynjgjgek8lel2f4adaea0svyk94a3njdqy88wumn8ghj7mn0wvhxcmmv9uq3uamnwvaz7tmwdaehgu3dwp6kytnhv4kxcmmjv3jhytnwv46z7qqlgd6hyun9de6zq3n0d3kx7aeqf35hxapvyqcnzte3xqhnyvpjxvx3tr5w",...]} 
 Very costly for the server then. But I think there are DBs that support probabilistic filters. Cuckoo filters maybe. Redis appears to support Cuckoo and Bloom.

On the other hand, a 1/10000 FPP would also give you the events of 1 in 10000 of all users, which, as the network grows, will be a lot.

Then again, these filters are typically used with much lower FPPs. 
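For intuition on the size/FPP trade-off, the standard Bloom-filter formulas give the optimal bit count m = -n ln p / (ln 2)^2 and hash count k = (m/n) ln 2. A small sketch (not tied to any particular Nostr proposal):

```python
import math

def bloom_params(n, p):
    """Optimal bit count m and hash count k for n items at false-positive rate p."""
    m = math.ceil(-n * math.log(p) / math.log(2) ** 2)
    k = max(1, round(m / n * math.log(2)))
    return m, k

m, k = bloom_params(10_000, 1e-4)
print(m // 8, "bytes,", k, "hash functions")
```

At n = 10000 and p = 1/10000 this comes out to roughly 24kB and 13 hash functions; tightening p by another factor of 100 adds only about 9.6 bits per key.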
 Aren't bloom filters faster than string comparisons? Maybe they aren't faster than pre-computed indexes?

Also, I assume relays would pre-compute and store the set of hashes needed for a standard Nostr Bloom filter.  
 Bloom filters use a sort of string comparison: they compare bitmasks. I'm not sure how an index could work for such a bitmask, but apparently Redis does something there.
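To make the "bitmask comparison" concrete: membership is k bit tests against the filter's bit array, no string comparison involved. A toy sketch; the parameters and the hashing scheme (SHA-256 with an index prefix) are assumptions for illustration, not any agreed standard:

```python
import hashlib

M_BITS = 1 << 16   # filter size in bits (illustrative)
K = 7              # number of hash functions (illustrative)

def bit_positions(pubkey_hex):
    """Derive K bit positions from SHA-256(i || pubkey); any agreed scheme works."""
    pk = bytes.fromhex(pubkey_hex)
    return [
        int.from_bytes(hashlib.sha256(bytes([i]) + pk).digest()[:4], "big") % M_BITS
        for i in range(K)
    ]

def add(bitarray, pubkey_hex):
    for pos in bit_positions(pubkey_hex):
        bitarray[pos // 8] |= 1 << (pos % 8)      # set each of the K bits

def might_contain(bitarray, pubkey_hex):
    # Membership test = K bitmask checks; True may be a false positive.
    return all(bitarray[pos // 8] & (1 << (pos % 8)) for pos in bit_positions(pubkey_hex))

bits = bytearray(M_BITS // 8)
add(bits, "aa" * 32)
print(might_contain(bits, "aa" * 32))   # added key is always found
print(might_contain(bits, "bb" * 32))   # other keys almost certainly are not
```

A relay could run `might_contain` per candidate pubkey, which is why pre-computed indexes on such a structure are the hard part.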

Pre-computing the hashes would make things super expensive again. In your example of a 67kB Bloom filter, pubkeys extended with pre-computed hashes would be 67kB each, too.

I tried to do this myself, but the best I got was to use probabilistic filters to save memory for active filters, not for querying the DB, and thus not for the communication either. The nice part is that you can mix event IDs and authors in the same filter.

https://github.com/Giszmo/NostrPostr/blob/master/nostrpostrlib/src/main/java/nostr/postr/Filter.kt#L195 
 Leo, I need to define a bloom/cuckoo filter format for pubkeys so that implementation A creates the filter that implementation B will use. Do you have any recommendations on how to encode the resulting filter itself? 

The filter will be saved in every event as a tag, so it must be something the other client can easily decode and apply.

Bonus if the writer of the filter can specify the precision. Any tips? 
 I'm not sure what you want to achieve exactly, but Bloom filters are really trivial in how they work and thus easy to serialize. Since you are talking about use across implementations, you would need an existing serialization standard or have to come up with one. I know there are libraries that support serialization, but apparently the serializations they consume or produce are not compatible from one library to another.
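As an illustration of how small such a standard could be, one conceivable tag-friendly encoding (purely an assumption, not an existing NIP) is a single string carrying a version, the filter parameters, and the base64-encoded bit array, which lets the writer pick m and k for the precision they want:

```python
import base64

def serialize_filter(k, m_bits, bitarray):
    """Pack hash count, bit count, and the bit array into one tag-safe string."""
    return f"v1:{k}:{m_bits}:" + base64.urlsafe_b64encode(bytes(bitarray)).decode()

def deserialize_filter(tag_value):
    """Parse the string back into (k, m_bits, bit array)."""
    version, k, m_bits, blob = tag_value.split(":", 3)
    assert version == "v1"                  # reject unknown formats
    return int(k), int(m_bits), bytearray(base64.urlsafe_b64decode(blob))

bits = bytearray(64)
bits[0] = 0b1010_0001
tag = serialize_filter(7, 512, bits)
k, m, restored = deserialize_filter(tag)
assert (k, m, restored) == (7, 512, bits)
```

Any two clients that agree on this string and on the hashing scheme can exchange filters; the hashing scheme itself would still need to be pinned down in the spec.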

If you have a compelling use case for a nip for example, I bet we can come up with a standard for nostr. Are you working on something?

With regards to your original question, I think it's easier and more efficient to tag lists in queries so you can refer to these tags in subsequent queries, since 67kB data structures aren't great to keep around with every pubkey, and 67kB isn't great to index either. 
 Trying to spec one out here https://github.com/nostr-protocol/nips/pull/1497 
 21:56 JJ7TIX/7 on JP-0110(Zao Quasi-National Park, JP-MG,JP-YT) 7004 CW cw[JJ7TIX/7] 
 Summary of Blackwell and Klukowski: States Should Pass Proof of Citizenship Laws to Register to Vote
nostr:nevent1qqsvy5ejvl3366k5glrz99pqn0jv5ugm55aslgdyktnthv8jhmp33uqzyrcs35mdw74l9rxpcrf4r4ltpg27a0pwd7kjj7xpzuc636tv4g22gqcyqqqqqqgnp9ha9

The Supreme Court has ruled that states can require proof of citizenship to register to vote, and Republican governors and lawmakers should pass such election integrity measures ahead of Election Day. The court's decision suggests that requiring proof of citizenship to register to vote is allowed under federal law, and over 86% of Americans agree that noncitizens should not participate in U.S. elections. Republican governors should call special sessions to pass similar laws, which would strengthen voter confidence and election security. 
 I’m completely sober right now. My energy changed completely. Like laying on a bed smiling at a friend. 🤣 
 Isn't it tiny? 
 "Shachiku-san"? How do you even read that? 