 nostr:nprofile1qyd8wumn8ghj7urewfsk66ty9enxjct5dfskvtnrdakj7qgmwaehxw309aex2mrp0yh8wetnw3jhymnzw33jucm0d5hsz9thwden5te0wfjkccte9ejxzmt4wvhxjme0qqsrhuxx8l9ex335q7he0f09aej04zpazpl0ne2cgukyawd24mayt8gfnma0u  I re-broadcast my note. To actually answer your question, probably the best solution would be a full in-place "update" event as a separate kind with an `e` tag pointing to the original "create" note. This way you don't have to trace a chain of diffs, just look at the timestamp, and you get verb semantics. This would only be a problem if a blog post had a million revisions, like if a client spammed a live draft as revisions. 5-10 revisions is a lot for a blog post, and easy to process.
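
A minimal sketch of what I mean, assuming a made-up kind number and plain newest-timestamp-wins resolution (nothing here is specified anywhere):

```typescript
// Hypothetical "update" event flow: a separate kind whose `e` tag points back
// at the original "create" note. The kind number is a placeholder, not assigned.
interface NostrEvent {
  id: string;
  pubkey: string;
  created_at: number; // unix seconds
  kind: number;
  tags: string[][];
  content: string;
  sig: string;
}

const UPDATE_KIND = 31234; // placeholder kind, not specified in any NIP

// Resolve the current revision of a post: newest matching update wins,
// otherwise fall back to the original create event.
function currentRevision(create: NostrEvent, updates: NostrEvent[]): NostrEvent {
  const revisions = updates.filter(
    (u) =>
      u.kind === UPDATE_KIND &&
      u.pubkey === create.pubkey && // only the author can revise
      u.tags.some(([name, value]) => name === "e" && value === create.id)
  );
  if (revisions.length === 0) return create;
  // "just look at the timestamp"
  return revisions.reduce((a, b) => (b.created_at > a.created_at ? b : a));
}
```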

I do still think there's a place for "annotations" that clients can display in a privileged position (the use case being updates at the bottom of a blog post, corrections, etc.). Diffs are way more complex, and dependent on each other, but also probably unnecessary for blog posts. 
 I like this idea. I previously worked on graph systems and this is how we handled revisions: changes were stored in new nodes with edges back to the original. This model works well with event sourcing. 
 Imagine you're browsing a feed and your client is fetching metadata for all those people on the fly. You can either fetch one event for each, or fetch 20 events for each and reduce them to a metadata object by applying diffs locally. Do you really think the second is the best solution in the real world?
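
To make the cost concrete, this is roughly what "reduce them to a metadata object by applying diffs locally" would have to look like, with an entirely made-up diff format -- and you would run it once per profile in the feed:

```typescript
// Roughly what "apply diffs locally" means per profile: fetch the whole chain,
// sort it, and fold it into a metadata object. The diff format is invented.
type ProfileDiff = { created_at: number; set: Record<string, string> };

function applyDiffs(diffs: ProfileDiff[]): Record<string, string> {
  // Only correct if the chain is complete -- and you repeat this for every
  // profile in the feed instead of fetching a single kind 0.
  return [...diffs]
    .sort((a, b) => a.created_at - b.created_at)
    .reduce<Record<string, string>>((profile, diff) => ({ ...profile, ...diff.set }), {});
}
```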

Also, since you're just fetching random people's profiles, you can never be sure (how could you be?) that their sequence of diffs is complete. You would actually need a blockchain to be sure. 
 We already can't make it work with replaceable events; imagine having to fetch an unreliable stream of events from an unspecified location.


https://image.nostr.build/f7c47b6b2e14db8ef0416ff2285d4987dae2c01ba305f267fb4b23e908ec8eb7.jpg 
 I wouldn't use diffs, I would just grab the most recent event and use that. You might occasionally miss the correct one, but with caching you'll eventually find it and hang on to it. DVMs can do a lot of the heavy lifting with bigger caches for stuff like this to reduce computation and bandwidth client-side. In my mind, DVMs are just nostr clients that run on a server. 
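
A minimal sketch of that "newest wins, cache it" idea, with the key and event shape purely illustrative:

```typescript
// "Grab the most recent event and hang on to it": keep whichever event seen
// for a given key has the newest created_at, no diffs involved.
const cache = new Map<string, { created_at: number; content: string }>();

function remember(key: string, event: { created_at: number; content: string }): void {
  const seen = cache.get(key);
  if (!seen || event.created_at > seen.created_at) {
    // You might briefly hold a stale one, but you converge as events arrive.
    cache.set(key, event);
  }
}
```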
 How is doing "update" events any different than bare replaceable events? You still get a single event in the end that replaces the previous ones, right? 
 Yes, but you have an event id that serves as the handle, rather than an `a` tag. Replies would tag the update event, which tags the create event. Since the updates don't replace the create, the create is still accessible, so you can pull all events, or just the events for a given revision. No data is being deleted, so clients don't have to guess. Right now, replies that only carry an `e` tag for a given revision get lost when the post is updated. 
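
As a sketch, reusing a made-up update kind and placeholder ids, pulling the whole history is just two ordinary NIP-01 filters, and a reply pins a revision by e-tagging it:

```typescript
// Pulling a post's full revision history: the create event by id, plus
// everything that e-tags it. The kind number is a placeholder and the ids
// stand in for real 64-char hex event ids.
const UPDATE_KIND = 31234; // same hypothetical kind as above
const createId = "<hex id of the create event>";

const historyFilters = [
  { ids: [createId] },                        // the original create event
  { kinds: [UPDATE_KIND], "#e": [createId] }, // every update pointing back at it
];

// A reply pins a specific revision by e-tagging that update, and also tags the
// create event so the thread root is never lost when the post is updated again.
const replyTags = [
  ["e", "<hex id of the chosen update event>"],
  ["e", createId],
];
```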
 NIP-28 used this approach, which I believe I had a big part in developing at the time. Now to me it looks ugly, dirty, disgusting, a very bad idea. I still don't get why referencing an initial event is better than using the "d" tag. Both are ultimately arbitrary strings.

About not losing history, again, that's the same point from before: it has costs.

Also, if these multiple versions were treated the same way normal events are today, it would break the relay query language: if you wanted to fetch multiple statuses from people, for example, you would end up getting multiple old statuses for the same person and none from others that hadn't updated in a while -- and so on. 
 And then again it's not very clear what we're getting from this.

For example, in the "update" event approach the same problem with contact lists remains: one client can overwrite an update event from another client and people lose part of their contact lists.

In the case of "delta" events you must ensure that you have the full history, which means you must know the exact relays where a person is publishing their deltas. But if you are diligent enough to know that, and you have successfully written the more complex software able to handle it, then why can't you do the same for replaceable events today and fetch the damn last-updated contact list from a relay that you know will always have the last version before replacing it?

I think your suggestion of having replaceable events + delta events (I don't remember the details) could have been a better approach actually, as it would preserve the best aspects of all worlds, but I'm not sure about the implementation complexity of it. 
 Your point about queries getting duplicates that crowd out some desired results is a good one. You could technically send one filter per pubkey with limit 1.
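
Something like this, as a sketch -- pubkeys are placeholders, and I'm using kind 0 here, but the same trick works for statuses:

```typescript
// "One filter per pubkey with limit 1": a single REQ whose payload is an array
// of filters, so one chatty author can't crowd out everyone else's latest event.
const pubkeys = ["<hex pubkey A>", "<hex pubkey B>", "<hex pubkey C>"];

const filters = pubkeys.map((pubkey) => ({
  kinds: [0],
  authors: [pubkey],
  limit: 1, // at most one (the newest) event per author
}));

const req = JSON.stringify(["REQ", "latest-per-pubkey", ...filters]);
// ws.send(req) against whichever relay(s) you're querying
```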

Lists should not have create/update like blog posts; they should instead have set/add/remove, which combines diffs and replacements in a conflict-free way.
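
A rough sketch of what I mean by set/add/remove; the operation names and shapes are made up for illustration, not a proposal for actual kinds:

```typescript
// Illustrative set/add/remove semantics for lists: fold the operations in
// timestamp order so edits from different devices merge instead of one
// replaceable event clobbering the other.
type ListOp =
  | { op: "set"; items: string[]; created_at: number }
  | { op: "add"; item: string; created_at: number }
  | { op: "remove"; item: string; created_at: number };

function materializeList(ops: ListOp[]): Set<string> {
  const list = new Set<string>();
  for (const op of [...ops].sort((a, b) => a.created_at - b.created_at)) {
    if (op.op === "set") {
      list.clear();
      op.items.forEach((item) => list.add(item));
    } else if (op.op === "add") {
      list.add(op.item);
    } else {
      list.delete(op.item);
    }
  }
  return list;
}
```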

From the perspective of event sourcing, projections should be a different layer from events. We have all this weird awkwardness because we have only one layer. I'm planning to work on some basic layer 2s via DVMs in the next month or so. 
 This is just a test

nostr:nevent1qyd8wumn8ghj7urewfsk66ty9enxjct5dfskvtnrdakj7qgmwaehxw309aex2mrp0yh8wetnw3jhymnzw33jucm0d5hsz9thwden5te0wfjkccte9ejxzmt4wvhxjme0qy28wumn8ghj7un9d3shjctzd3jjummjvuhszymhwden5te0wp6hyurvv4cxzeewv4ej7qguwaehxw309a3ksunfwd68q6tvdshxummnw3erztnrdakj7qgnwaehxw309ahkvenrdpskjm3wwp6kytcpr9mhxue69uhhyetvv9ujuumwdae8gtnnda3kjctv9uqzqv339nqqsvqqg3ytqnwevaqx2gx3yyej9hr2sjaus82xd089uy3d8ppn9a 
 Mutable events don't mix well with event sourcing models. I agree with you, projections or materialized views are another layer. I wonder if the caching relay approach Primal is using is effectively this second layer. 
 It's one way to do it, but sort of centralized, because it assumes you're running a cache for the whole network. I can't imagine it would be easy to compose multiple caches with coverage over different parts of the network. What's needed is a better interface that lets you query particular relays ad hoc, or compose the results from multiple caches. 
 I don't think it requires the cache being global, although that's the approach Primal went with. I'm imagining a second process running near your relay that's responsible for projecting the current state, e.g. the current revision of a note. This second layer could expose the same API as a standard nostr relay to make clients just work. You could also suck this second-layer logic into your relay implementation, depending on your scale/reliability/ops requirements. 
 In a large-scale distributed system analogy, new events would hit a queue. One consumer group would write the raw events to storage. Another consumer group would update the projected state in the second layer when applicable. 
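
A toy version of that split, with the queue, raw storage, and projection all as in-memory stand-ins and a purely illustrative event shape:

```typescript
// Two consumer groups: one appends every event to raw storage, the other
// keeps a projection of the latest revision per root.
interface RawEvent {
  id: string;
  root: string;       // id of the original "create" event this revises
  created_at: number; // unix seconds
  content: string;
}

const rawStore: RawEvent[] = [];                 // consumer group 1: append-only history
const projection = new Map<string, RawEvent>();  // consumer group 2: current state per root

function consume(event: RawEvent): void {
  rawStore.push(event); // raw history is never thrown away

  const current = projection.get(event.root);
  if (!current || event.created_at > current.created_at) {
    projection.set(event.root, event); // this is what the relay-shaped API would serve
  }
}
```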
 Yeah, I think I agree, the query interface just needs to be carefully designed so that multiple caches can be queried simultaneously and reconciled. See the discussion around COUNT. 
 Nothing prevents relays from storing all revisions of a replaceable event today. Would you want to do it if you were running a relay? I wouldn't, unless the user was being charged for each post individually. Would the average user be happy to pay for storing every single edit they made in a post or on their profile or every time they switched to listening to a new song and that updated their NIP-38 status? 
 I think a history of statuses over time would be cool. I didn't even realize that was a replaceable event. A history of profile edits would be less useful, but how often do you update your profile? The ratio of kind 1's to kind 0's is probably around 100 to 1.

Replaceable events are "good" for infrequently updated things, because otherwise you run into collisions from multiple devices updating the same list or what have you, which means the volume isn't significant. But once the volume is significant, replaceables start to break. 
 I agree, but that is very different from "replaceable events are completely stupid", which was your take yesterday. 
 It's still my take 
 I'd argue that the best option is to update a note by deleting it and sending it again as new. Otherwise, clients might say "I already have this note, no need to re-download it".
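
Sketching that flow with a NIP-09 kind 5 deletion followed by a fresh note -- ids and content are placeholders, and both events would still need to be id-hashed and signed before publishing:

```typescript
// Delete-then-repost: a NIP-09 deletion (kind 5) e-tagging the old note,
// followed by a brand-new kind 1 with a new id.
const oldNoteId = "<hex id of the note being replaced>";
const now = Math.floor(Date.now() / 1000);

const deletion = {
  kind: 5,
  created_at: now,
  tags: [["e", oldNoteId]],
  content: "superseded by an edited version",
};

const replacement = {
  kind: 1,
  created_at: now,
  tags: [],
  content: "the edited text of the note",
};

// Publish both; because the replacement has a fresh id, clients that already
// have the old note can't skip it as "already downloaded".
```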