This is one of the most important developments in nostr. We're getting rsync for nostr queries. Incredible work, thank you Doug! Time to start implementing this in nostrdb. nostr:note1f6ym3cfyjfzzf64f060aglcvzn2mr36z4g4ve5ep2uw6h6kcucts0t44hq
Need a translator on this. 🤷♂️
Yes I would love to understand
Relay A has some set of notes, relay B has another set of notes. How do you sync these relays efficiently? Also, from a client perspective: how do I query and download only the notes I don't already have? No longer do we need to download tons of redundant data.
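Here's a minimal, self-contained Go sketch of the range-fingerprint idea behind negentropy-style set reconciliation. It is not the actual negentropy wire protocol (the real thing works over sorted (timestamp, id) keys, compact frame encodings, and relay messages); the local simulation and names are just for illustration. The point is that both sides can discover exactly which IDs the other is missing by comparing fingerprints of ranges and recursing only into ranges that differ, instead of shipping their full sets:

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"sort"
	"strings"
)

// fingerprint summarizes a sorted range of IDs; if two ranges have equal
// fingerprints we assume they hold the same IDs and stop recursing.
// (Toy version: real implementations use incremental fingerprints.)
func fingerprint(ids []string) [32]byte {
	return sha256.Sum256([]byte(strings.Join(ids, ",")))
}

// reconcile recursively narrows down where two sorted ID lists differ and
// returns the IDs each side is missing.
func reconcile(a, b []string) (aMissing, bMissing []string) {
	if fingerprint(a) == fingerprint(b) {
		return nil, nil // ranges identical, nothing to exchange
	}
	// Small ranges: exchange the full ID lists (the protocol's base case)
	// and diff them directly.
	if len(a) <= 2 || len(b) <= 2 {
		return diff(b, a), diff(a, b)
	}
	// Otherwise split both ranges at a midpoint key and recurse on each half.
	mid := a[len(a)/2]
	am1, bm1 := reconcile(below(a, mid), below(b, mid))
	am2, bm2 := reconcile(from(a, mid), from(b, mid))
	return append(am1, am2...), append(bm1, bm2...)
}

func below(ids []string, key string) []string { return ids[:sort.SearchStrings(ids, key)] }
func from(ids []string, key string) []string  { return ids[sort.SearchStrings(ids, key):] }

// diff returns the elements of xs that are not present in ys.
func diff(xs, ys []string) []string {
	seen := make(map[string]bool, len(ys))
	for _, y := range ys {
		seen[y] = true
	}
	var out []string
	for _, x := range xs {
		if !seen[x] {
			out = append(out, x)
		}
	}
	return out
}

func main() {
	relayA := []string{"e1", "e2", "e3", "e5", "e7"}
	relayB := []string{"e1", "e3", "e4", "e5", "e6", "e7"}
	aMissing, bMissing := reconcile(relayA, relayB)
	fmt.Println("relay A should fetch:", aMissing) // [e4 e6]
	fmt.Println("relay B should fetch:", bMissing) // [e2]
}
```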
You wouldn't even bother. Randomly sample a smaller set and average it for a quick answer that's good enough for the accuracy you want at the moment. If you want to know more exactly, take a larger data set and sample that, based on how much time and resources you have. No?
not sure what you mean, usually you don’t want to make nostr queries probabilistic.
Why not? That's all they will ever be anyway. Everything is an estimate except the information itself, which you can never have all in one place at once in this case anyway, due to the rapidly growing decentralization. Think on it, or don't, just my 2 sats.
Woot. It's like rsync for Nostr 🤣
That's what HORNET Storage Scionic Merkle DAGs are for. I'm building the first relay that will use it, btw; we are aiming to complete it in 4-5 months (that's our milestone target and funding limit).
Have you read the negentropy v1 writeup? It’s elegant and already built into strfry.
It's not capable of guaranteeing complete synchronisation of things. I'm even working on (in my spare time) a filter plugin that quickly recognises whether a reply post references a whitelisted identity, by building filters out of posts made by the user - for the purpose of caching first-level replies to posts so the relay has them immediately. I'm going to be using bloom filters to build a recogniser; not sure how the state is going to be stored yet, but probably as replaceable private events for the filter itself, as a virtual user.
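For the bloom-filter recogniser idea, here's a toy Go sketch, purely illustrative and not the actual plugin: load whitelisted pubkeys into a filter, then test a reply's tagged pubkeys against it. The names (BloomFilter, MightReferenceWhitelisted) are made up, and a real version would tune the bit-array size and hash count for the whitelist size and persist the filter state (e.g. in replaceable events, as described above). Bloom filters can give false positives but never false negatives, so a hit still warrants a confirming lookup before caching:

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
)

// BloomFilter is a fixed-size bit array with k hash probes per item.
type BloomFilter struct {
	bits []byte
	k    int
}

func NewBloomFilter(sizeBytes, k int) *BloomFilter {
	return &BloomFilter{bits: make([]byte, sizeBytes), k: k}
}

// positions derives k bit positions from a single sha256 digest of the
// item (this slicing scheme works for k up to 7).
func (f *BloomFilter) positions(item string) []uint64 {
	sum := sha256.Sum256([]byte(item))
	pos := make([]uint64, f.k)
	for i := 0; i < f.k; i++ {
		v := binary.LittleEndian.Uint64(sum[i*4 : i*4+8])
		pos[i] = v % uint64(len(f.bits)*8)
	}
	return pos
}

// Add sets the item's k bits in the filter.
func (f *BloomFilter) Add(item string) {
	for _, p := range f.positions(item) {
		f.bits[p/8] |= 1 << (p % 8)
	}
}

// MightContain reports whether all k bits are set: false means definitely
// not present, true means possibly present (false positives happen).
func (f *BloomFilter) MightContain(item string) bool {
	for _, p := range f.positions(item) {
		if f.bits[p/8]&(1<<(p%8)) == 0 {
			return false
		}
	}
	return true
}

// MightReferenceWhitelisted checks a reply's tagged pubkeys against the
// whitelist filter; a true result should be confirmed before caching.
func MightReferenceWhitelisted(f *BloomFilter, taggedPubkeys []string) bool {
	for _, pk := range taggedPubkeys {
		if f.MightContain(pk) {
			return true
		}
	}
	return false
}

func main() {
	whitelist := NewBloomFilter(1024, 4)       // 8192 bits, 4 probes
	whitelist.Add("npub1examplewhitelisted")   // hypothetical whitelisted identity

	replyTags := []string{"npub1randomstranger", "npub1examplewhitelisted"}
	fmt.Println("cache this reply?", MightReferenceWhitelisted(whitelist, replyTags)) // true
}
```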
Almost forgot my second point: compact, addressable Merkle trees let you guarantee full set-membership discovery, assuming the user didn't post the notes to a relay unknown to the one trying to find them. But it can still know the IDs, and it can know the set for a given batch of them that has been bundled into a tree.
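To make that concrete, here's a generic Go sketch of a plain binary Merkle root over a batch of event IDs. It is not the Scionic Merkle DAG layout HORNET Storage actually uses, just the basic idea: a single 32-byte root commits to the whole batch, and with an inclusion proof against that root, membership of any one ID in the bundle can be checked:

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// merkleRoot folds the leaf hashes pairwise up to a single root. An odd
// node at any level is carried up unchanged to the next level.
func merkleRoot(leaves [][32]byte) [32]byte {
	if len(leaves) == 0 {
		return [32]byte{}
	}
	level := leaves
	for len(level) > 1 {
		var next [][32]byte
		for i := 0; i < len(level); i += 2 {
			if i+1 == len(level) {
				next = append(next, level[i])
				continue
			}
			next = append(next, hashPair(level[i], level[i+1]))
		}
		level = next
	}
	return level[0]
}

// hashPair hashes the concatenation of two child hashes.
func hashPair(a, b [32]byte) [32]byte {
	return sha256.Sum256(append(a[:], b[:]...))
}

func main() {
	// Hypothetical batch of event IDs bundled into one tree (real nostr
	// event IDs are already 32-byte hashes; placeholders used here).
	batch := []string{"event-id-1", "event-id-2", "event-id-3", "event-id-4"}

	var leaves [][32]byte
	for _, id := range batch {
		leaves = append(leaves, sha256.Sum256([]byte(id)))
	}

	root := merkleRoot(leaves)
	fmt.Printf("root committing to %d events: %x\n", len(batch), root)
}
```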
I do not understand any of this, but… https://image.nostr.build/93345e755a936bbdbdf9da5928a7fa1560e44afa51a09669c362fc13f4158bdb.gif
Rsync is such an underappreciated tool, the best of its kind.
I use it regularly. It's an extension of cp/scp that does lots of stuff, like scanning the directory tree, reading file sizes and datestamps, and computing checksums on the fly when it needs them. IMO, filesystems need several advancements, but we are still stuck with late-70s Unix filesystem designs. A few technologies build on that base and take it further, but sadly not enough. A number of things people associate with the Macintosh relate to this. Apple put a lot of effort into making better filesystems, but sadly their user-oriented advances have not been widely adopted. It seems to me, for example, that every file should have a hash in its record that can be quickly accessed. But no. At best, such systems only create mirror images of the filesystem inside a database; no filesystem exists, afaik, that keeps a current hash of the file contents.
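As a rough illustration of that last point, here's a small Go sketch of the workaround in use today: since the filesystem's own metadata record holds no content hash, a tool walks the tree and keeps its own mirror of path → (size, mtime, hash) out-of-band, typically in a database. This is only a hedged sketch of the pattern, not any particular tool:

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"io"
	"io/fs"
	"os"
	"path/filepath"
)

// fileRecord is the out-of-band metadata the filesystem itself doesn't keep.
type fileRecord struct {
	Size    int64
	ModTime int64 // unix seconds
	Hash    [32]byte
}

// indexTree builds an in-memory mirror of a directory tree. A real tool
// would persist this and only re-hash files whose size/mtime changed,
// which is essentially rsync's quick-check strategy.
func indexTree(root string) (map[string]fileRecord, error) {
	index := make(map[string]fileRecord)
	err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil || d.IsDir() {
			return err
		}
		info, err := d.Info()
		if err != nil {
			return err
		}
		f, err := os.Open(path)
		if err != nil {
			return err
		}
		defer f.Close()
		h := sha256.New()
		if _, err := io.Copy(h, f); err != nil {
			return err
		}
		var sum [32]byte
		copy(sum[:], h.Sum(nil))
		index[path] = fileRecord{Size: info.Size(), ModTime: info.ModTime().Unix(), Hash: sum}
		return nil
	})
	return index, err
}

func main() {
	index, err := indexTree(".")
	if err != nil {
		fmt.Println("walk failed:", err)
		return
	}
	for path, rec := range index {
		fmt.Printf("%x  %8d  %s\n", rec.Hash[:4], rec.Size, path)
	}
}
```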
I should also add that there has been quite a lot of rigidity around this. A lot of "filetypes" actually have filesystems inside them. Hell, HTML files encode a tree. Hypertext links are like filesystem links across the network. The technology is still very immature. The benefits of extending the filesystem have yet to be realised, and I'm already 40 years in on this, from my early experiences with simple tape-recorder sound-modulated recording systems; better approaches have existed for at least 30 years but just have not been implemented. There aren't enough computer programmers, and partly this is because programming languages are overly complex and build systems are unfriendly and labor-intensive.