I don't do anything to pull author from the document yet, though I will add that soon. Asciidoctor has some nice convenience methods that automatically parse fields like author out of the document. Thank you for reminding me to include author and version information in the d tag; I hadn't thought about that detail yet.
I'm thinking through d tag conventions more, and we may have to revise our pattern somewhat. The pattern `<title>-<author>-<version>` only really works for the root index of a whole book. Chapters/sections within that book won't have author and version attached to them by default. I see at least two options:

1. Carry author and version information down into the book's hierarchy, and attach it to every chapter and section index. This makes the ids of _everything_ longer (which increases URI length within our app), but it increases the likelihood of name uniqueness even of sections within a book.
2. Don't include author and version information in the d tag identifier of every chapter and section, but include that information in the tags. If we know the author and edition of the book within which a chapter resides, for instance, we'll include it in the index event generated for that chapter.

2.5. If we take this latter approach, I'm interested in exploring adding additional fields to the d tag array. The question is whether the first identifier in that array is _always_ used to generate note identifiers for 30000-series events, or whether the whole d tag array is used. Perhaps we can include full author and version information in the d tag array, but not have it be mandatory for looking up an event. The details probably depend on relay indexing and search implementations, there.

Let me know which option you prefer, or if you have other alternatives.
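To make options 1 and 2 easier to compare, here is a minimal sketch of the tags each would produce for a chapter. The `slugify` normalization, the function names, and the separate `author` and `version` tag names are assumptions for illustration, not something we have agreed on.

```python
import re


def slugify(text):
    """Lowercase the text and join its alphanumeric runs with hyphens.

    This normalization is an assumption; the real rule is still undecided.
    """
    return "-".join(re.findall(r"[a-z0-9]+", text.lower()))


def option_1_tags(title, author, version):
    """Option 1: bake author and version into the d tag identifier itself."""
    d_value = "-".join(slugify(part) for part in (title, author, version))
    return [["d", d_value]]


def option_2_tags(title, author, version):
    """Option 2: keep the d tag short and carry the rest in separate tags.

    The "author" and "version" tag names are hypothetical.
    """
    return [["d", slugify(title)], ["author", author], ["version", version]]


print(option_1_tags("Chapter 1", "John Doe", "1.0"))
print(option_2_tags("Chapter 1", "John Doe", "1.0"))
```

The trade-off shows up directly: option 1 yields one long identifier per chapter, while option 2 keeps the identifier short but leaves uniqueness to whatever else distinguishes the event.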
I like 2. For 2.5, is there a link to the d tag spec? The nips repo points to NIP-01, but I don't see any reference to it.
https://wikistr.com/nip-54*266815e0c9210dfa324c6cba3573b14bee49da4209a9456f9484e5106cd408a5
I think we were leaning into the definition for wiki pages, since that's the closest event kind.
Hmm. I do like 2.5 then. By default we keep the first entry as the title. Any other important searchable metadata (which can be different, depending on domain) could go there. Could this essentially combine nip36 and 54 to reduce redundancy?
Don't know. I don't fully understand 2.5. Need a map or something. 😅
I'm thinking something like:

```
"tags": [
  ["d", "chapter-1", "book:example-book", "author:John Doe", "version:1.0"],
  // other tags...
]
```
Ah, I get it, thanks. I forget that the tags are arrays.
Y'all I searched and the NIPs repo doesn't formally define the `d` tag anywhere. That means no one can tell us we can't extend it lol.

I was thinking we use positional values to extend the `d` tag array. Something like:

```json
"tags": [
  ["d", "war-and-peace", "leo-tolstoy", "penguin-classics-edition"]
]
```

The general format would be `["d", <title>, <author>, <edition>]` where edition is a human-readable edition name (as opposed to just a number). The event _should_ still be addressable by `#d` and the title, just like other event kinds' d tags, but we can increase address specificity for the Nostr Knowledge Base use case.

Then we can bake this specificity into wikilinks (see NIP-54 for an existing wikilinks specification). A document might contain `[[War and Peace. Leo Tolstoy. Penguin Classics Edition.]]` and we can specify that clients should split that at the periods and normalize it into a d tag array reference, so it becomes `["#d", "war-and-peace", "leo-tolstoy", "penguin-classics-edition"]`. The client can then use this tag array to search its relays, find the closest match, and display to the user a hyperlink to the referenced event. Basically, we define a citation format for NKB events.

Since the event ID is derived from a serialization of the whole event, including tags, increasing identifier specificity will always generate a unique event ID.

We'll have to experiment with how relays index events by d tag, and how they respond to queries for such events. Maybe Stella's PHP utilities can help us with that. Worst case, relays only support searching by the first `d` tag value and return a bunch of matching results, and then we have Alexandria walk through the results and find matches by author and edition.
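Here is a rough sketch of the wikilink normalization and of the worst-case client-side walk described above. The function names, the `slugify` rule, and the prefix-matching strategy are all assumptions for illustration; events are plain dicts shaped like Nostr events.

```python
import re


def slugify(text):
    """Lowercase the text and join its alphanumeric runs with hyphens (assumed rule)."""
    return "-".join(re.findall(r"[a-z0-9]+", text.lower()))


def wikilink_to_d_filter(wikilink):
    """Split a citation-style wikilink at periods and normalize each part."""
    inner = wikilink.strip()
    if inner.startswith("[[") and inner.endswith("]]"):
        inner = inner[2:-2]
    parts = [slugify(part) for part in inner.split(".")]
    # Drop empty parts, such as the one left after the trailing period.
    return ["#d"] + [part for part in parts if part]


def match_events(events, d_filter):
    """Keep events whose d tag values start with the wanted values.

    Intended for the worst case where a relay matched only on the first
    d tag value and returned every edition of the title.
    """
    wanted = d_filter[1:]
    matches = []
    for event in events:
        d_tag = next((tag for tag in event["tags"] if tag and tag[0] == "d"), None)
        if d_tag is not None and d_tag[1:1 + len(wanted)] == wanted:
            matches.append(event)
    return matches


d_filter = wikilink_to_d_filter(
    "[[War and Peace. Leo Tolstoy. Penguin Classics Edition.]]"
)
print(d_filter)
```

One limitation of splitting at periods: a title or author that itself contains a period would be split into extra parts, so the citation format would need an escape rule or a different separator for those cases.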
Do you think we should draw a fine-grained distinction between [[leo-tolstoy]], the topic, and [[author:leo-tolstoy]], anything written by him?
Maybe using `[[author:leo-tolstoy]]` should link to a search page showing results with that `author` tag. We can use a similar format for wikilinks to search results or tag-based feeds. Perhaps you want to link to a feed of literature on a specific topic.
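A small sketch of how a client might tell the two wikilink forms apart. The function names and the shape of the returned dict are hypothetical, just a way to picture the routing between a topic page and a tag-based search.

```python
def parse_wikilink(wikilink):
    """Return (category, value); category is None when there is no prefix."""
    inner = wikilink.strip()
    if inner.startswith("[[") and inner.endswith("]]"):
        inner = inner[2:-2]
    category, separator, value = inner.partition(":")
    if separator and category.strip() and value.strip():
        return (category.strip().lower(), value.strip())
    return (None, inner.strip())


def wikilink_to_target(wikilink):
    """Route a prefixed link to a tag search and a bare link to a topic lookup."""
    category, value = parse_wikilink(wikilink)
    if category is None:
        return {"type": "topic", "query": value}
    return {"type": "tag-search", "tag": category, "query": value}


print(wikilink_to_target("[[author:leo-tolstoy]]"))
print(wikilink_to_target("[[leo-tolstoy]]"))
```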
Yeah, just allowing for a multitude of ways for human navigation. The general case would be to remove the '{noun}:' and just find anything matching the same string across the categories. Kind of an aside, but I'm wondering if it would make sense backend-wise to pair an optional caching DB with the relays we want to associate with, to improve performance. I'd imagine that without the DB, search performance would degrade as the number of events increases.
The app in general probably shouldn't have its own DB, but our instance, on our website, could easily have its own DB, server, etc. to provide premium features.
GIVE ME ELASTIC SEARCH! 😂
I think the caching DB could be part of our premium offering. The client is open-source, so anyone can run an instance. The data? Not everybody will have that. We can attach a caching service to TheCitadel relay and introduce a paid tier for those who want that additional performance.
>>The index MUST also be uniquely identifiable using a combination of the d tag's first value (usually containing the title), the pubkey, and the kind fields.<< This is the part in the NKBIP-01 that pertains to the d tag. We could change the text to read >>The index MUST also be uniquely identifiable using a combination of the d tag's values (at least including the first value, usually the title), the pubkey, and the kind fields.<<
The spec doesn't assume that a 30040 is a book, that's why it's that way. It could also be an index of articles or wiki pages from different authors. Or pages from various books from various authors in various editions. That's why only the title is required, as everything has some sort of title.
I like #1, for cases where the lower-level stuff doesn't have its own tags. I had to do this when I ran into the problem that I had two different editions of one audiobook and they kept overwriting each other's events, even though I changed the version of the 30040.
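The overwriting problem can be reproduced with a toy model of how a relay keeps addressable events. This is a sketch that assumes the relay keys each event on kind, pubkey, and the first d tag value and keeps only the newest one; the `store` function and the sample events are made up for illustration.

```python
def address_key(event):
    """Build the (kind, pubkey, first d value) key assumed to identify an event."""
    d_tag = next((tag for tag in event["tags"] if tag and tag[0] == "d"), None)
    d_value = d_tag[1] if d_tag is not None and len(d_tag) > 1 else ""
    return (event["kind"], event["pubkey"], d_value)


def store(events):
    """Simulate a relay that keeps only the newest event per address key."""
    kept = {}
    for event in events:
        key = address_key(event)
        current = kept.get(key)
        if current is None or event["created_at"] > current["created_at"]:
            kept[key] = event
    return list(kept.values())


# Two editions whose chapter indexes share the same d value: one gets replaced.
colliding = [
    {"kind": 30040, "pubkey": "abc", "created_at": 1,
     "tags": [["d", "chapter-1"], ["version", "1"]]},
    {"kind": 30040, "pubkey": "abc", "created_at": 2,
     "tags": [["d", "chapter-1"], ["version", "2"]]},
]

# Same chapters with the version carried into the d value (option 1): both survive.
distinct = [
    {"kind": 30040, "pubkey": "abc", "created_at": 1,
     "tags": [["d", "chapter-1-v1"]]},
    {"kind": 30040, "pubkey": "abc", "created_at": 2,
     "tags": [["d", "chapter-1-v2"]]},
]

print(len(store(colliding)), len(store(distinct)))
```

Under this model, changing only a separate version tag does not change the address, which matches the audiobook behavior described above.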