 almost there with this... 

found one problem, the tag writer was not putting a zero byte in to indicate the tag field is empty

this was also causing the content field to not be decoded properly, because the decoder was reading the length value of the content where it expected the tag field, then finding invalid bytes or even varints following it and giving up on the decode
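to make that failure concrete, here is a minimal sketch in python. the layout (varint tag count, then length-prefixed tags, then length-prefixed content) and every name in it are assumptions for illustration, not the actual codec. it only shows why leaving out the zero byte makes the decoder treat the content length as the tag field.

```python
from __future__ import annotations


def put_uvarint(n: int) -> bytes:
    """Encode a non-negative int as an unsigned LEB128-style varint."""
    out = bytearray()
    while n >= 0x80:
        out.append((n & 0x7F) | 0x80)
        n >>= 7
    out.append(n)
    return bytes(out)


def get_uvarint(buf: bytes, pos: int) -> tuple[int, int]:
    """Decode a varint starting at pos; return (value, next position)."""
    shift = 0
    value = 0
    while True:
        b = buf[pos]  # raises IndexError if the buffer runs out
        pos += 1
        value |= (b & 0x7F) << shift
        if b < 0x80:
            return value, pos
        shift += 7


def encode(tags: list[bytes], content: bytes, write_empty_marker: bool = True) -> bytes:
    """Hypothetical layout: tag count, length-prefixed tags, length-prefixed content."""
    out = bytearray()
    if tags or write_empty_marker:
        out += put_uvarint(len(tags))  # a single zero byte when there are no tags
    for t in tags:
        out += put_uvarint(len(t)) + t
    out += put_uvarint(len(content)) + content
    return bytes(out)


def decode(buf: bytes) -> tuple[list[bytes], bytes]:
    """Inverse of encode; assumes the tag count is always present."""
    count, pos = get_uvarint(buf, 0)
    tags = []
    for _ in range(count):
        n, pos = get_uvarint(buf, pos)
        tags.append(buf[pos:pos + n])
        pos += n
    n, pos = get_uvarint(buf, pos)
    return tags, buf[pos:pos + n]
```

with the marker written, `decode(encode([], b"hello"))` gives back the empty tag list and the content. with `write_empty_marker=False` the first byte the decoder sees is the content length, so it goes looking for that many tags inside the content bytes and runs off the end of the buffer.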

next obvious problem has to do with `a` tags, which i also mentioned... a typical example:

original
```
["a",
"32123:7759fb24cec56fc57550754ca8f6d2c60183da2537c8f38108fdf283b20a0e58:a470b862-1522-4c91-bb11-ded892d8d161","wss://relay.wavlake.com"]
```

mangled
```
["a",
"0:37373539666232346365633536666335373535303735346361386636643263363031383364613235333763386633383130386664663238336232306130653538:a470b862-1522-4c91-bb11-ded892d8d161","wss://relay.wavlake.com"]
```

if you are familiar with ascii, you'd know that hex 37 is the character `7` and hex 35 is `5`, so you can see what has happened: it has failed to encode the kind prefix (which is a decimal), and the pubkey has come out hex-encoded a second time, as the ascii characters of the hex string. that hex value is what gets decoded to binary in my binary encoder, for field 2 of a, p and e tags and for id, pubkey and signature
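the mangled value can be reproduced in a couple of lines of python. the "buggy path" here is only a guess at the mechanism (the ascii text being treated as if it were the raw bytes); the codec itself is not shown in this thread.

```python
pubkey_hex = "7759fb24cec56fc57550754ca8f6d2c60183da2537c8f38108fdf283b20a0e58"

# intended path: 64 hex characters -> 32 raw bytes -> the same 64 hex characters
raw = bytes.fromhex(pubkey_hex)
roundtrip = raw.hex()

# assumed buggy path: the ascii characters of the hex string are kept as-is,
# and hex-encoding them on the way out doubles the length
mangled = pubkey_hex.encode("ascii").hex()

print(len(raw))       # 32
print(mangled[:8])    # 37373539, the characters '7', '7', '5', '9'
print(len(mangled))   # 128
```

the first 8 characters of `mangled` match the start of the mangled tag above, which is why it looks like a double encode rather than random corruption.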

the `a` tag is the most complicated because it contains two numbers, which i have encoded as a 16 bit prefix value for the kind and 32 bytes for the npub that follows it in an `a` tag.

likely again it's some mess to do with the `a` tag processing: either it's writing the varint prefix when it shouldn't, or not writing it correctly... really, the `a` tag should not need a length prefix until the third, arbitrary text field, so i've probably messed that up somewhere
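a sketch of that layout, as i read the description: 2 bytes of kind, 32 bytes of pubkey, and only then a varint length for the free text part. the byte order (big-endian), the function names and the strictness checks are all assumptions made for the example, not the real codec.

```python
from __future__ import annotations

import struct


def put_uvarint(n: int) -> bytes:
    """Encode a non-negative int as an unsigned LEB128-style varint."""
    out = bytearray()
    while n >= 0x80:
        out.append((n & 0x7F) | 0x80)
        n >>= 7
    out.append(n)
    return bytes(out)


def get_uvarint(buf: bytes, pos: int) -> tuple[int, int]:
    """Decode a varint starting at pos; return (value, next position)."""
    shift = 0
    value = 0
    while True:
        b = buf[pos]
        pos += 1
        value |= (b & 0x7F) << shift
        if b < 0x80:
            return value, pos
        shift += 7


def encode_a_value(value: str) -> bytes:
    """Pack 'kind:pubkey_hex:identifier' as fixed-width kind and pubkey plus prefixed text."""
    kind_s, pubkey_hex, ident = value.split(":", 2)
    kind = int(kind_s)                  # the kind is decimal text, not hex
    pubkey = bytes.fromhex(pubkey_hex)  # the pubkey is hex text
    # refuse anything that would not come back byte-for-byte identical
    if str(kind) != kind_s or not 0 <= kind <= 0xFFFF:
        raise ValueError("kind is not a canonical 16 bit decimal")
    if len(pubkey) != 32 or pubkey.hex() != pubkey_hex:
        raise ValueError("pubkey is not 64 lowercase hex characters")
    ident_bytes = ident.encode("utf-8")
    return struct.pack(">H", kind) + pubkey + put_uvarint(len(ident_bytes)) + ident_bytes


def decode_a_value(buf: bytes) -> str:
    """Inverse of encode_a_value."""
    (kind,) = struct.unpack_from(">H", buf, 0)
    pubkey = buf[2:34]
    n, pos = get_uvarint(buf, 34)
    ident = buf[pos:pos + n].decode("utf-8")
    return f"{kind}:{pubkey.hex()}:{ident}"
```

for the example tag above this comes out at 71 bytes (2 + 32 + 1 + 36) against 107 characters of text, and the kind and pubkey never need a length prefix because their widths are fixed.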

note that all of the bugs i'm finding here were outliers... events with no tags are not common, and `a` tags are mostly only used with delete events. still a problem though, because for those cases it fails to decode properly, or encode properly, or both, and this is unexpected behaviour for clients

probably there will be one or two more little glitches to get through here

like, for example, maybe i should try to make a prefix code for these tag types so if it spots this prefix code it knows the field is noncompliant and stores it verbatim
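one way that idea could look, as a rough sketch only: a leading marker byte that says whether the field was compacted or stored verbatim. the marker values, the 2 byte length (instead of a varint, to keep the example short) and the function names are all hypothetical.

```python
import re

HEX64 = re.compile(r"[0-9a-f]{64}")
COMPACT, VERBATIM = 0x00, 0x01  # hypothetical marker bytes


def encode_field(value: str) -> bytes:
    """Compact a compliant 64 char hex value, store anything else as-is."""
    if HEX64.fullmatch(value):
        return bytes([COMPACT]) + bytes.fromhex(value)
    raw = value.encode("utf-8")
    if len(raw) > 0xFFFF:
        raise ValueError("field too long for the 2 byte length used in this sketch")
    return bytes([VERBATIM]) + len(raw).to_bytes(2, "big") + raw


def decode_field(buf: bytes) -> str:
    """Inverse of encode_field."""
    if buf[0] == COMPACT:
        return buf[1:33].hex()
    n = int.from_bytes(buf[1:3], "big")
    return buf[3:3 + n].decode("utf-8")
```

the cost is one extra byte on every compliant field, so whether it is worth it depends on how common the noncompliant ones turn out to be.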

nostr:nevent1qvzqqqqqqypzqnyqqft6tz9g9pyaqjvp0s4a4tvcfvj6gkke7mddvmj86w68uwe0qyghwumn8ghj7mn0wd68ytnvv9hxgtcpz4mhxue69uhhgetnwsh8yetpd3ujumr0dshsz9mhwden5te0d4kx26m49ehx7um5wgcjucm0d5hszythwden5te0dehhxarj9emkjmn99uqzpu0wdtqumzdlrwt2ctqaatylhwcuw8j6ewyw66wypth5lhhf0xfjvhkccv 
 oh yeah, the reason behind the special handling of these tags is it's a freebie compression of 50% which is exploited in my binary codec to save a lot of storage space on events

i considered the idea of storing them as the raw JSON, but in a lot of cases that wastes more than 20% of the space per event. for reactions, for example, it's like over 50% wasted
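the arithmetic behind the 50% is just hex versus raw bytes. a tiny sketch, assuming a reaction with one e tag and one p tag (that shape is an assumption, and so are the names):

```python
from __future__ import annotations

# hex characters per field in the assumed reaction event
HEX_FIELD_CHARS = {"id": 64, "pubkey": 64, "sig": 128, "e tag": 64, "p tag": 64}


def hex_vs_binary(fields: dict[str, int]) -> tuple[int, int]:
    """Return (bytes as hex text, bytes as raw binary) for the given fields."""
    as_text = sum(fields.values())
    as_bytes = as_text // 2  # two hex characters per byte
    return as_text, as_bytes


text, binary = hex_vs_binary(HEX_FIELD_CHARS)
print(text, binary)  # 384 192
```

a reaction has almost nothing else in it, so those hex fields are most of the event, which is why it shows the biggest difference.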

i'll see how it goes with the issue of incorrect values going into p/e/a tags, like exactly how many times i encounter events that have dumbshit in them like bech32 encoded entities instead of hex... honestly, unless i see thousands of them i'm just gonna have the encoder reject them, because any client that is still making this error now needs to die and the user needs to quit using it, and relays not storing the events will contribute towards that end
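rejecting at encode time could be as simple as the check below. it's a sketch: the function name and error messages are made up, it only covers e and p tags, and the prefix list is just the common NIP-19 ones.

```python
from __future__ import annotations

import re

HEX64 = re.compile(r"[0-9a-f]{64}")
BECH32_PREFIXES = ("npub1", "note1", "nevent1", "nprofile1", "naddr1")


def check_tag_value(tag: list[str]) -> None:
    """Raise ValueError if an e or p tag's second field is not 64 hex characters."""
    if len(tag) < 2 or tag[0] not in ("e", "p"):
        return  # not a tag this sketch cares about
    value = tag[1]
    if HEX64.fullmatch(value):
        return
    if value.startswith(BECH32_PREFIXES):
        raise ValueError(f"{tag[0]} tag holds a bech32 entity, expected 64 hex characters")
    raise ValueError(f"{tag[0]} tag value is not 64 hex characters")
```

counting how often each branch fires over a pile of real events would answer the "thousands of them" question before committing to rejecting outright.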
 getting closer, i think i've completely fixed the binary tag encode/decode

now encountering some kind of error with content fields

the #nostrudel event publisher tool is very handy for me: i paste in events from the logs that failed to decode back from the binary encoding, and it verifies for me that the event is valid. i also know it's decoding from json and back to json correctly, so whatever is wrong is somewhere between the binary encode and decode

probably some error in my state machine with handling  or maybe the encoder is missing out putting in important data... 

soon i think this task will be finished

nostr:nevent1qvzqqqqqqypzqnyqqft6tz9g9pyaqjvp0s4a4tvcfvj6gkke7mddvmj86w68uwe0qyghwumn8ghj7mn0wd68ytnvv9hxgtcpzamhxue69uhk6mr9dd6jumn0wd68yvfwvdhk6tcqyrqvjvdn57s59qqkt0ayhmluk7raunecfzelykvf88s0mg9as20qqrjm9dm